XAAC: Extended-Reality Augmentative and Alternative Communication
Inspiration & Context
AAC (Augmentative and Alternative Communication) encompasses "all of the ways that someone communicates besides talking" and is used by more than 2.5 million people in the US alone, representing up to 1% of the population. This includes individuals with autism, cerebral palsy, Parkinson's disease, ALS, apraxia, and many other conditions that affect speech.
Current AAC solutions face significant limitations:
- Context unawareness: Devices don't understand the user's environment
- Slow communication: Navigating hierarchical symbol tables is time-consuming
- Device dependency: Requires tablets and manual interaction
- Gesture limitations: Not suitable for users with severe motor impairments
- Cognitive barriers: Difficulty learning symbol-to-object associations
What XAAC Does
XAAC revolutionizes AAC communication by bringing symbols into Mixed Reality and making communication context-aware. Here's how it works:
Real-time Object Detection & Symbol Mapping
- Uses Florence-2 AI to detect objects in the user's environment through the headset's passthrough camera
- Automatically fetches corresponding AAC symbols from OpenSymbols.org
- Places 3D AR symbols directly on detected real-world objects
- Creates instant visual associations between symbols and physical items
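The detection-to-symbol step above can be sketched as a small pipeline: normalize the label Florence-2 returns, then query OpenSymbols for a matching pictogram. This is an illustrative Python sketch, not the project's actual code; the function names are ours, and the OpenSymbols search endpoint shown is an assumption you should verify against their API docs.

```python
from urllib.parse import urlencode

def normalize_label(label: str) -> str:
    """Collapse a raw Florence-2 label like ' Coffee Mug ' to a search term."""
    return label.strip().lower()

def build_symbol_query(
    label: str,
    base_url: str = "https://www.opensymbols.org/api/v1/symbols/search",  # assumed endpoint
) -> str:
    """Build an OpenSymbols search URL for a detected object label."""
    return f"{base_url}?{urlencode({'q': normalize_label(label)})}"
```

The returned URL would then be fetched (with an API token) and the first symbol's image placed as a 3D anchor on the detected object in Unity.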
Intelligent Sentence Building
- Users select symbols by gazing at objects in their environment
- Selected symbols are added to a "Sentence Pot" - a visual sentence builder
- Custom ML algorithm provides contextual word suggestions based on current symbols
- Supports both affirmative and interrogative sentence construction
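The "Sentence Pot" described above is essentially an ordered collection of selected symbols. A minimal sketch of that data structure (names are illustrative; the real implementation lives in the Unity/C# client):

```python
class SentencePot:
    """Ordered collection of gaze-selected symbols; the visual sentence builder."""

    def __init__(self):
        self.symbols: list[str] = []

    def add(self, word: str) -> None:
        """Append a symbol the user selected by gazing at an object."""
        self.symbols.append(word)

    def remove_last(self) -> None:
        """Undo the most recent selection, if any."""
        if self.symbols:
            self.symbols.pop()

    def as_gloss(self) -> str:
        """Return the raw symbol sequence, ready for translation."""
        return " ".join(self.symbols)
```

The gloss string is what later gets handed to the translation step.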
Seamless Translation & Communication
- Converts AAC symbol sentences into natural English using GPT-2
- Text-to-speech output for immediate verbal communication
- Real-time companion app receives messages for bidirectional communication
- Enables hands-free interaction for users with motor impairments
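Turning a symbol gloss into natural English with a language model usually means framing the gloss as a rewrite task. The exact prompt XAAC sends to GPT-2 is not shown in this write-up, so the wording below is a hypothetical sketch:

```python
def build_translation_prompt(gloss_words: list[str], interrogative: bool = False) -> str:
    """Frame an AAC symbol gloss as a rewrite task for a language model.

    The prompt wording here is illustrative, not the project's actual prompt.
    """
    gloss = " ".join(gloss_words)
    mode = "question" if interrogative else "sentence"
    return f"Rewrite the AAC gloss '{gloss}' as a natural English {mode}:"
```

The model's completion is then passed to Android TTS for speech output and mirrored to the companion app over the WebSocket link.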
Context-Aware Learning
- Eliminates need to navigate complex symbol hierarchies
- Symbols appear where and when needed in the real world
- Accelerates learning curve by creating direct symbol-object associations
- Adapts suggestions based on environmental context
Tech Stack Credits
AI & Computer Vision
- Microsoft Florence-2 - Zero-shot object detection
- NVIDIA Inference Server - AI model hosting
AAC & Communication
- OpenSymbols.org API - AAC symbol fetching
- GPT-2 via HuggingFace - AAC to natural language translation
- Android TTS - Text-to-speech conversion
XR & Connectivity
- Unity with Meta XR SDK - Mixed Reality development
- Meta Quest Passthrough Camera Access - Real-world object detection
- WebSocket - Real-time companion app communication
Challenges We Faced
Real-time Performance Optimization
- Challenge: Balancing AI inference speed with detection accuracy on mobile XR hardware
- Solution: Implemented intelligent frame sampling and result caching to maintain 30+ FPS while running Florence-2 detection
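The frame-sampling-plus-caching idea is simple to sketch: only send every Nth passthrough frame to the detector and reuse the last result in between, so rendering never blocks on inference. A minimal illustration (class and parameter names are ours, not the project's):

```python
class DetectionThrottle:
    """Run object detection on at most every Nth frame; reuse cached results otherwise."""

    def __init__(self, every_n_frames: int = 10):
        self.every_n = every_n_frames
        self.frame_count = 0
        self.cached: list = []  # last detection result, reused between runs

    def process(self, frame, detect_fn) -> list:
        """Return detections for this frame, running detect_fn only when due."""
        self.frame_count += 1
        if self.frame_count % self.every_n == 1:
            self.cached = detect_fn(frame)
        return self.cached
```

In the real system the detector call would also be asynchronous, so a slow Florence-2 round-trip never stalls the 30+ FPS render loop.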
Symbol-Object Semantic Mapping
- Challenge: Bridging the gap between AI object labels and AAC symbol vocabulary
- Solution: Created a contextual mapping system that finds the closest AAC symbols for detected objects, with fallback mechanisms for unmapped items
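One way to realize the mapping-with-fallback described above is a small lookup table from detector labels to AAC vocabulary, with a substring match as a middle tier and a generic symbol as the last resort. The vocabulary entries and fallback word below are invented for illustration:

```python
# Illustrative label -> AAC symbol vocabulary (not the project's actual table).
SYMBOL_VOCAB = {
    "cup": "cup",
    "mug": "cup",
    "cellphone": "phone",
    "laptop": "computer",
}

def map_label_to_symbol(label: str, vocab: dict = SYMBOL_VOCAB, fallback: str = "thing") -> str:
    """Map a detector label to the closest AAC symbol, with fallbacks."""
    key = label.lower().replace(" ", "")
    if key in vocab:                      # exact match
        return vocab[key]
    for known, symbol in vocab.items():   # substring match: 'coffee mug' -> 'mug'
        if known in key:
            return symbol
    return fallback                       # unmapped items get a generic symbol
```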
Spatial UI Design for Accessibility
- Challenge: Designing intuitive 3D interfaces for users with diverse motor and cognitive abilities
- Solution: Implemented multiple interaction modalities (gaze, gesture, voice) with customizable UI scaling and placement
Context-Aware Word Prediction
- Challenge: Creating meaningful word suggestions that understand both linguistic and visual context
- Solution: Built a hybrid ML system combining n-gram analysis with object detection context for more relevant predictions
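The hybrid idea above combines two signals: bigram statistics from training text, and a boost for words whose real-world objects are currently visible. A toy sketch of that blend (the scoring scheme and boost value are our assumptions, not the project's tuned model):

```python
from collections import defaultdict

class HybridPredictor:
    """Bigram counts blended with a score boost for objects currently in view."""

    def __init__(self, context_boost: float = 3.0):
        self.bigrams = defaultdict(lambda: defaultdict(int))
        self.boost = context_boost

    def train(self, sentences: list[str]) -> None:
        """Count word-pair frequencies from example sentences."""
        for s in sentences:
            words = s.lower().split()
            for a, b in zip(words, words[1:]):
                self.bigrams[a][b] += 1

    def suggest(self, prev_word: str, visible_objects: list[str], k: int = 3) -> list[str]:
        """Rank candidate next words; visible objects get boosted scores."""
        scores = dict(self.bigrams[prev_word.lower()])
        for obj in visible_objects:
            scores[obj] = scores.get(obj, 0) + self.boost
        return sorted(scores, key=scores.get, reverse=True)[:k]
```

With this blend, "cup" can outrank purely statistical candidates when a cup is actually in front of the user.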
Future Features
Enhanced AI Capabilities
- Multi-modal scene understanding: Combine object detection with scene context, spatial relationships, and temporal patterns
- Emotion recognition: Detect facial expressions and body language to suggest appropriate emotional responses
- Activity recognition: Understand ongoing activities to provide relevant communication suggestions
Expanded Communication Modes
- Speech-to-symbols translation: Convert spoken language from others into visual symbols for better comprehension
- Multi-language support: Real-time translation between different languages and symbol systems
- Gesture recognition: Custom gesture creation for personalized communication shortcuts
Learning & Therapy Integration
- Gamified learning modules: Interactive exercises for symbol recognition and communication skills
- Progress tracking: Detailed analytics for therapists and caregivers
- Personalized curricula: AI-driven learning paths adapted to individual user needs and goals
Social & Collaborative Features
- Multiplayer therapy sessions: Shared virtual spaces for therapist-patient collaboration
- Family communication networks: Extended support for family members and caregivers
- Community symbol sharing: User-generated symbols and communication patterns
Hardware Integration
- Eye tracking optimization: Precise gaze-based selection for hands-free operation
- Brain-computer interfaces: Future integration with BCI technology for users with severe motor limitations
- Wearable device sync: Integration with smartwatches and other assistive devices
XAAC transforms AAC communication from a device-dependent, hierarchical system into an intuitive, context-aware experience. By bridging the digital and physical worlds, it makes communication faster, more natural, and accessible to all.
Built With
- cs
- unity
- xr