Inspiration

Show Me a Sign is a Duolingo-style platform dedicated to learning American Sign Language (ASL), an area we realized mainstream language apps like Duolingo and Babbel have overlooked. Whether you're a total beginner or looking to refine your skills, "Show Me a Sign" offers a fun, modern, and interactive way to break language barriers.

What it does

The website streams a real-time video feed to our backend, where a custom-trained CNN model detects and assesses your ASL gestures. We employ two methods to assess hand position and predict letters: a top-K evaluation from our CNN model, and geometric pattern detection built with OpenCV's finger detection. By combining the predictions from each method, weighted by their respective confidence levels, we determine which letter the user is most likely displaying. Users can then practice their skills with letter-by-letter flashcards that let them move on once they correctly sign a letter.
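The fusion step can be sketched as a weighted score merge. This is a minimal illustration, not our production code: the function name, the 0.7/0.3 weights, and the example confidences are all hypothetical.

```python
# Illustrative sketch: fuse CNN top-K output with a geometric prediction,
# weighted by confidence. Weights and names are hypothetical.

def fuse_predictions(cnn_topk, geometric, cnn_weight=0.7, geo_weight=0.3):
    """Combine two scored prediction sources into one ranking.

    cnn_topk:  list of (letter, confidence) pairs from the CNN's top-K.
    geometric: a single (letter, confidence) pair from the OpenCV detector.
    Returns the (letter, score) with the highest combined weighted score.
    """
    scores = {}
    for letter, conf in cnn_topk:
        scores[letter] = scores.get(letter, 0.0) + cnn_weight * conf
    geo_letter, geo_conf = geometric
    scores[geo_letter] = scores.get(geo_letter, 0.0) + geo_weight * geo_conf
    return max(scores.items(), key=lambda kv: kv[1])

# Example: the CNN is torn between "A" and "S"; geometry breaks the tie.
best = fuse_predictions([("A", 0.55), ("S", 0.40), ("T", 0.05)], ("A", 0.9))
```

Here the geometric vote pushes "A" ahead even though the CNN alone was uncertain, which is exactly the failure mode the dual approach is meant to cover.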

How we built it

Frontend

  • React with TypeScript: We built a fully-typed, component-based architecture using React and TypeScript for type safety and improved developer experience
  • Vite: An extremely fast build tool providing instant HMR (Hot Module Replacement) for rapid development
  • TailwindCSS: Used for responsive, utility-first styling with custom animations and transitions
  • shadcn/ui: Implemented for accessible, reusable UI components that maintain consistent design language
  • React Router: Used for seamless navigation between lessons, progress tracking, and profile management

Backend & Integration

  • Supabase: Handles authentication, user profiles, and progress tracking with real-time database capabilities
  • Flask API: Custom Python backend hosting our machine learning models, processing video frames sent from the frontend
  • WebSocket Communication: Enables real-time bidirectional communication between browser and server for instant feedback

Machine Learning Pipeline

  • Custom CNN Model: Trained on ~25,000 ASL hand sign images to recognize letters with high accuracy. We used a wireframe dataset to avoid biases due to skin color and hand shape/size, and to improve detection in varying light conditions.
  • Dual Recognition System: Innovative approach combining both:
    • CNN-based predictions with confidence scoring
    • Geometric pattern recognition using OpenCV's finger detection algorithms
  • Real-time Processing: Optimized to handle video streams at 30fps with minimal latency and output real-time predictions
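Per-frame predictions at 30fps can flicker between letters from one frame to the next. A common remedy, sketched here with an illustrative window size (not necessarily the exact scheme we shipped), is a sliding-window majority vote over recent predictions:

```python
from collections import Counter, deque

class PredictionSmoother:
    """Hypothetical sliding-window smoother for per-frame letter
    predictions; window size is an illustrative choice."""

    def __init__(self, window=10):
        # deque with maxlen automatically drops the oldest prediction
        self.window = deque(maxlen=window)

    def update(self, letter):
        """Record one per-frame prediction and return the current
        majority letter across the window."""
        self.window.append(letter)
        return Counter(self.window).most_common(1)[0][0]

# A single misread "S" in a run of "A" frames does not flip the output.
smoother = PredictionSmoother(window=5)
stream = ["A", "A", "S", "A", "A", "A"]
stable = [smoother.update(x) for x in stream]
```

The trade-off is a few frames of extra latency in exchange for output that only changes when the new letter is sustained.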

Challenges we ran into

Our biggest challenge was finding an effective ASL hand sign detection model. Accuracy and speed were key: incorrect or slow recognition hurt the user experience. We tested prebuilt models and trained our own, ultimately combining a geometric method for instant feedback with a lightweight machine learning model for better accuracy.

Data transmission between the frontend and backend posed significant engineering challenges. Sending video frames for analysis while keeping the application responsive demanded an efficient communication protocol. We ultimately settled on a WebSocket-based approach with controlled frame rates to prevent overwhelming the server during peak usage.
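The "controlled frame rates" idea amounts to a time-based gate on the sender side: capture runs at full speed, but frames are only forwarded when a minimum interval has elapsed. A minimal sketch, with illustrative fps values (our actual limits may differ):

```python
class FrameThrottle:
    """Illustrative sender-side gate: allow at most max_fps frames
    through per second. Timestamps are in milliseconds so the
    arithmetic in this demo stays exact."""

    def __init__(self, max_fps=10):
        self.min_interval_ms = 1000 // max_fps
        self.last_sent_ms = None

    def should_send(self, now_ms):
        """Return True if enough time has passed to forward this frame."""
        if (self.last_sent_ms is None
                or now_ms - self.last_sent_ms >= self.min_interval_ms):
            self.last_sent_ms = now_ms
            return True
        return False

# Simulated capture at one frame every 34 ms (~30fps): a 10fps gate
# forwards every third frame and drops the rest.
throttle = FrameThrottle(max_fps=10)
sent = [throttle.should_send(now_ms=i * 34) for i in range(30)]
```

Dropped frames are simply never serialized or sent over the WebSocket, which keeps both the browser's encode loop and the server's inference queue from backing up.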

Another hurdle was designing an intuitive UI to make the tool user-friendly. Real-time video processing added complexity on this front as well, requiring latency optimization and cross-browser compatibility fixes so the interface stayed responsive.

Additionally, we faced challenges in creating a learning experience that was both educational and engaging. Balancing difficulty progression, providing constructive feedback, and maintaining user motivation required multiple design iterations.

What's next for Show Me a Sign

We plan to expand our platform by incorporating a wider range of words and phrases into our lessons, making ASL learning more comprehensive and accessible. In addition to expanding vocabulary, we aim to introduce video lessons that demonstrate proper ASL techniques, ensuring users learn not just the signs but also the nuances of hand positioning, movement, and facial expressions that are essential to effective communication.

Beyond content expansion, we are exploring ways to enhance user engagement through interactive exercises and real-time feedback to help users refine their skills. We also hope to build a community aspect where learners can practice with each other and receive guidance from fluent signers.

Our ultimate goal is to create an immersive, intuitive, and educational experience that empowers users to learn ASL in a way that is both fun and effective.
