Speech <-> BSL

Inspiration

The deaf community often faces significant communication barriers in everyday life, from workplaces to healthcare settings. Existing solutions focus heavily on ASL (American Sign Language), leaving a gap for BSL users. Inspired by the need for inclusivity, we set out to build a tool that bridges this gap while prioritizing accessibility, functionality, and security.

What It Does

Our project enables seamless two-way communication between speech and BSL (British Sign Language). It features:

•⁠ ⁠Speech-to-BSL: Converts spoken language into animated BSL gestures.

•⁠ ⁠BSL-to-Speech: Translates live BSL gestures into text or speech using a webcam.

This multimodal tool requires no external devices beyond a webcam and microphone, promoting ease of use and accessibility.
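As a rough illustration of the Speech-to-BSL direction, the sketch below turns recognised text into an ordered list of animation cues. It assumes letter-by-letter fingerspelling, which matches the alphabet classifier described in this write-up but may not be how the animation side actually works. The function name `text_to_fingerspelling` and the `"PAUSE"` cue are hypothetical.

```python
import string
from typing import List

PAUSE = "PAUSE"  # hypothetical cue marking a gap between words


def text_to_fingerspelling(text: str) -> List[str]:
    """Convert recognised text into a sequence of letter cues.

    Assumption: each letter maps to one BSL fingerspelling animation,
    and anything that is not a-z (digits, punctuation) is skipped.
    """
    cues: List[str] = []
    for word in text.lower().split():
        letters = [ch for ch in word if ch in string.ascii_lowercase]
        if not letters:
            continue  # nothing spellable in this token
        if cues:
            cues.append(PAUSE)  # separate words with a short pause
        cues.extend(letters)
    return cues


if __name__ == "__main__":
    print(text_to_fingerspelling("Hi, Bob!"))
```

An animation player could then walk this list and play one clip per cue.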

How We Built It

•⁠ ⁠Hand Gesture Recognition: Used MediaPipe for hand tracking and extracted hand-landmark features for BSL gestures.

•⁠ ⁠Custom Model: Trained a classifier on over 25,000 images of BSL gestures to recognize the BSL alphabet.

•⁠ ⁠Speech Recognition: Integrated speech-to-text functionality using Python libraries and APIs.

•⁠ ⁠Webcam Integration: Enabled live detection of hand gestures via OpenCV.

•⁠ ⁠Security: Discarded user data (webcam frames and voice audio) after processing by default; when retention is required, the data is stored securely in AWS S3.
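The feature-extraction step can be sketched as follows. This is a minimal example of one common way to turn hand landmarks into classifier input, not necessarily the features we used. It assumes each detected hand arrives as 21 `(x, y, z)` points with the wrist at index 0 (the layout MediaPipe Hands reports). The helper names `normalise_hand` and `two_hand_features` are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Landmark = Tuple[float, float, float]
NUM_LANDMARKS = 21  # MediaPipe Hands reports 21 landmarks per hand


def normalise_hand(landmarks: List[Landmark]) -> List[float]:
    """Make landmarks independent of hand position and size.

    Points are shifted so the wrist (index 0) is the origin, then
    divided by the largest wrist-to-point distance.
    """
    if len(landmarks) != NUM_LANDMARKS:
        raise ValueError("expected 21 landmarks per hand")
    wx, wy, wz = landmarks[0]
    shifted = [(x - wx, y - wy, z - wz) for x, y, z in landmarks]
    # Fall back to 1.0 if every point coincides with the wrist.
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in shifted) or 1.0
    return [c / scale for point in shifted for c in point]


def two_hand_features(
    left: Optional[List[Landmark]], right: Optional[List[Landmark]]
) -> List[float]:
    """Concatenate both hands into one fixed-length vector.

    A missing hand is zero-filled so the classifier always sees
    126 values (2 hands x 21 landmarks x 3 coordinates).
    """
    empty = [0.0] * (NUM_LANDMARKS * 3)
    left_part = normalise_hand(left) if left else empty
    right_part = normalise_hand(right) if right else empty
    return left_part + right_part
```

Keeping the vector length fixed matters for BSL in particular, because most fingerspelled letters use both hands and either hand may drop out of frame.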

Challenges We Ran Into

•⁠ ⁠Gesture Complexity: Handling subtle differences between BSL letters and two-hand interactions was technically challenging.

•⁠ ⁠Dataset Limitations: Existing datasets lacked sufficient real-world examples, necessitating augmentation and custom training.

•⁠ ⁠Real-Time Performance: Ensuring low latency for live gesture recognition and speech conversion posed optimization challenges.
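One inexpensive way to steady live predictions without adding much latency is a sliding-window majority vote over the per-frame labels. The sketch below illustrates that general technique under assumed parameters; it is not a description of our pipeline. The class name `PredictionSmoother` and the `window` and `min_votes` values are hypothetical.

```python
from collections import Counter, deque
from typing import Deque, Optional


class PredictionSmoother:
    """Emit a label only once it dominates the most recent frames."""

    def __init__(self, window: int = 5, min_votes: int = 4) -> None:
        if not 0 < min_votes <= window:
            raise ValueError("min_votes must be between 1 and window")
        self.history: Deque[str] = deque(maxlen=window)
        self.min_votes = min_votes
        self.last_emitted: Optional[str] = None

    def update(self, label: str) -> Optional[str]:
        """Add one per-frame prediction; return a label when it is stable.

        Returns None while the window is undecided or while the stable
        label is the same one that was emitted last.
        """
        self.history.append(label)
        top, votes = Counter(self.history).most_common(1)[0]
        if votes >= self.min_votes and top != self.last_emitted:
            self.last_emitted = top
            return top
        return None


if __name__ == "__main__":
    smoother = PredictionSmoother(window=5, min_votes=3)
    for frame_label in ["a", "a", "b", "a", "a", "a", "c", "c", "c"]:
        stable = smoother.update(frame_label)
        if stable is not None:
            print("stable letter:", stable)
```

The window size trades responsiveness for stability: a larger window filters more flicker but delays each letter by more frames. This simple version cannot emit the same letter twice in a row, so doubled letters would need an explicit reset, for example when no hands are detected.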

Accomplishments That We're Proud Of

•⁠ ⁠Developed a fully functional prototype within a limited timeframe.

•⁠ ⁠Achieved high accuracy in recognizing BSL gestures through a custom-trained model.

•⁠ ⁠Created a user-friendly, accessible interface requiring only a webcam and microphone.

•⁠ ⁠Prioritized security by discarding user data immediately after processing.

What We Learned

•⁠ ⁠Technical Skills: Improved our knowledge of computer vision, real-time processing, and machine learning.

•⁠ ⁠User-Centric Design: Understood the importance of building intuitive solutions tailored to end-user needs.

•⁠ ⁠Collaboration: Strengthened our ability to work as a team under pressure, leveraging each member's strengths.

What's Next for Minerva's Hackathon

•⁠ ⁠Enhanced Gesture Recognition: Add smoother animation blending and integrate lip-reading for improved accuracy.

•⁠ ⁠Multilingual Support: Expand to other sign languages like ASL and ISL.

•⁠ ⁠Scalability: Build a mobile-friendly version to ensure accessibility across platforms.

•⁠ ⁠Community Engagement: Partner with organisations supporting the deaf community to refine the tool further.