Inspiration
Traditional gaming excludes many players with motor disabilities, and American Sign Language (ASL) remains unfamiliar to most hearing people. We asked: what if learning ASL could be as engaging as playing a video game? By combining accessible webcam-only controls with ASL gesture recognition, we created an experience that educates while it entertains, making sign language learning feel like casting real magic.
What it does
The Last Vigil teaches American Sign Language through immersive gameplay. Players defend against gothic enemies by performing ASL alphabet gestures to cast spells, while gaze tracking aims their attacks; no keyboard or mouse is needed. Each correctly signed letter triggers magical effects, reinforcing muscle memory and sign recognition. The game recognizes all 26 ASL alphabet letters and introduces them progressively through increasingly challenging waves, transforming language education into an action-packed adventure.
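The progressive introduction of letters can be sketched as a function mapping a wave number to the pool of letters in play. This is a minimal illustration, not the game's actual code: `LETTER_ORDER`, `LETTERS_PER_WAVE`, and `letters_for_wave` are names we invent here, and the ordering of letters is an assumed example.

```python
# Roughly ordered from simpler to harder handshapes (illustrative only).
LETTER_ORDER = list("ABCLOVDEFGHIKMNPQRSTUWXYZJ")  # all 26 letters
LETTERS_PER_WAVE = 3  # assumed pacing: three new letters per wave

def letters_for_wave(wave: int) -> list[str]:
    """Return the pool of ASL letters active on a given 1-indexed wave."""
    if wave < 1:
        raise ValueError("waves are numbered from 1")
    count = min(len(LETTER_ORDER), wave * LETTERS_PER_WAVE)
    return LETTER_ORDER[:count]
```

With this pacing, wave 1 drills only `A`, `B`, `C`, and by wave 9 every letter of the alphabet can appear.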
How we built it
- Frontend: TypeScript + HTML5 Canvas rendering at 60fps with WebSocket streaming
- Backend: Python FastAPI on Vultr Cloud CPUs running our custom-trained ASL detection model
- Custom AI Model: Trained on 54,000+ ASL dataset images for accurate recognition of all 26 alphabet letters
- Architecture: Client handles rendering; server processes all AI and game logic
- Education Design: Gesture sequences spell words, with real-time visual feedback confirming correct signs
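The "gesture sequences spell words" mechanic above boils down to matching recognized letters against a target word and confirming each correct sign. A minimal sketch of that idea, assuming a hypothetical `SpellTracker` class (the real project's API may differ):

```python
class SpellTracker:
    """Tracks progress as recognized ASL letters are matched against a target word."""

    def __init__(self, word: str):
        self.word = word.upper()
        self.position = 0  # index of the next letter the player must sign

    def feed(self, letter: str) -> bool:
        """Advance on a correct letter; return True so the UI can flash feedback."""
        if self.position < len(self.word) and letter.upper() == self.word[self.position]:
            self.position += 1
            return True
        return False  # wrong sign: stay on the same letter

    @property
    def complete(self) -> bool:
        """The spell fires once every letter has been signed in order."""
        return self.position == len(self.word)
```

For example, a tracker built with `SpellTracker("FIRE")` completes after the player signs F, I, R, E in order, with each `feed` call driving the per-letter visual feedback.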
Challenges we ran into
Training a Robust ASL Model: Collecting and curating 54,000+ diverse ASL images spanning different hand sizes, skin tones, lighting conditions, and backgrounds was a major undertaking, and fine-tuning the model to stay accurate across all 26 letters while preserving real-time performance required extensive experimentation.
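When accuracy must hold across all 26 letters, an aggregate score can hide weak classes, so per-letter accuracy is worth computing on a validation set. A small illustrative helper (the name `per_class_accuracy` is ours, not code from the project):

```python
from collections import defaultdict

def per_class_accuracy(true_labels, predicted_labels):
    """Map each letter to its accuracy, so weak classes stand out."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {letter: correct[letter] / total[letter] for letter in total}
```

Sorting the resulting dictionary by value immediately shows which letters need more training data or augmentation.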
Balancing Education and Engagement: Making sign practice feel exciting rather than tedious required careful game design—we added dramatic spell effects and progressive difficulty to maintain flow while ensuring educational value.
Real-time Performance: Processing high-resolution webcam frames through our custom detection model while maintaining low latency for responsive gameplay demanded aggressive optimization and efficient CPU utilization.
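One common way to keep latency bounded in a pipeline like this is to decouple render rate from inference rate: keep drawing at 60fps but drop webcam frames so the model only sees a capped number per second. A sketch of that idea under assumed names (`FrameThrottle`, `should_process`); the project's actual optimization may differ:

```python
import time

class FrameThrottle:
    """Admit at most `max_fps` frames per second to the inference model."""

    def __init__(self, max_fps: float, clock=time.monotonic):
        self.min_interval = 1.0 / max_fps
        self.clock = clock  # injectable for testing
        self.last_sent = float("-inf")

    def should_process(self) -> bool:
        """Return True if enough time has passed to run inference again."""
        now = self.clock()
        if now - self.last_sent >= self.min_interval:
            self.last_sent = now
            return True
        return False  # drop this frame; rendering continues unaffected
```

The game loop checks `should_process()` before streaming a frame to the server, so a slow inference pass degrades recognition frequency instead of stalling the whole game.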
Accomplishments that we're proud of
- Trained a custom ASL detection model on 54,000+ images achieving high accuracy across all 26 alphabet letters
- Made ASL learning genuinely fun through gamification with immediate visual rewards
- Created fully accessible gameplay requiring zero traditional game controllers
- Achieved 60fps performance while processing dual AI models (gaze + gesture recognition)
- Built scalable cloud architecture that handles intensive computer vision on CPU servers
What we learned
Training custom computer vision models for education demands extensive, diverse datasets; we learned that model accuracy depends heavily on data quality and variety. Balancing inference speed against accuracy taught us practical optimization techniques. TypeScript's strong typing proved invaluable for maintaining complex game state, and deploying ML models on cloud CPUs required careful consideration of cost versus performance tradeoffs.
What's next for The Last Vigil
- ASL Words and Phrases: Expand beyond fingerspelling to teach complete signs and common phrases
- Multilingual Sign Languages: Extend model training to support other sign language systems
- Community Challenges: Multiplayer modes where players collaborate using sign language
Built With
- fastapi
- html5
- mediapipe
- opencv
- python
- typescript
- vite