SilentVoice_BD

Inspiration

Communication is a right, not a luxury. Yet for millions of people who rely on Bangla Sign Language (BdSL) as their primary means of communication, this right remains unfulfilled in our daily interactions. According to UNICEF, around 13.9 million people use BdSL as their main way of communication, but most of society doesn't understand it at all. This creates a massive communication barrier that isolates an entire community from participating fully in education, healthcare, employment, and social interactions.

In a world that's rapidly advancing in AI and technology, we realized that this communication gap represents one of the most pressing accessibility challenges in Bangladesh. We were inspired by the potential to leverage modern computer vision and machine learning to break down these barriers and create a more inclusive society where every voice can be heard.

What it does

SilentVoice_BD is a real-time Bangla Sign Language recognition and translation system that bridges the communication gap between the deaf/hard-of-hearing community and the hearing community. The system:

Real-time Translation: Converts BdSL gestures into Bangla text and speech instantly
Multi-input Support: Works with both uploaded video files and live webcam feeds
Interactive Learning: Provides a practice mode with scoring for users learning sign language
Continuous Improvement: Incorporates user feedback to improve translation accuracy
Accessibility Features: Offers downloadable transcripts and live group call subtitles

The platform serves different user types:

Anonymous users get limited video-to-text/speech translation and basic sign library access
Registered users enjoy unlimited translations, live webcam support, practice modes, interactive correction feedback, and downloadable transcripts
Admins can manage user accounts, configure access levels, manage AI models, and monitor system health

How we built it

Our technical architecture combines several cutting-edge technologies:

Dataset & Training

Utilized the BDSLW60 dataset for training our sign language recognition model
Implemented data preprocessing and augmentation techniques to improve model robustness
Created additional synthetic training data to expand vocabulary coverage
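
The augmentation step above can be sketched roughly as follows. This is an illustration only, assuming each clip has already been reduced to per-frame (x, y) landmark coordinates in the range 0 to 1; the function names mirror, jitter, and time_warp are hypothetical and not taken from the project code.

```python
import random

# A sequence is a list of frames; each frame is a list of (x, y) landmark
# coordinates normalized to [0, 1] (an assumption made for this sketch).

def mirror(sequence):
    """Flip every landmark horizontally, simulating the opposite signing hand."""
    return [[(1.0 - x, y) for x, y in frame] for frame in sequence]

def jitter(sequence, sigma=0.01, rng=None):
    """Add small Gaussian noise to each coordinate."""
    rng = rng or random.Random()
    return [[(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in frame]
            for frame in sequence]

def time_warp(sequence, factor):
    """Resample to round(len * factor) frames using nearest-frame lookup."""
    n = len(sequence)
    m = max(1, round(n * factor))
    return [sequence[min(n - 1, int(i * n / m))] for i in range(m)]

clip = [[(0.2, 0.5), (0.4, 0.6)], [(0.3, 0.5), (0.5, 0.6)]]
augmented = [mirror(clip), jitter(clip, rng=random.Random(0)), time_warp(clip, 2.0)]
```

Mirroring is only appropriate for signs whose meaning does not depend on which hand is used, so it may need to be switched off per class.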

Machine Learning Architecture

Built a Bidirectional Long Short-Term Memory (BiLSTM) neural network for sequential gesture recognition
Implemented advanced sequence modeling to capture temporal dependencies in sign language gestures
Used transfer learning techniques to optimize performance with available data
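
To make the bidirectional idea concrete, here is a dependency-free sketch of a BiLSTM forward pass. It is not the trained model: the weights are random, the sizes are arbitrary, and the names LSTMCell, run, and bilstm_encode are invented for this example. A real build would more likely use a deep learning framework such as the Python/TensorFlow stack listed under Built With.

```python
import math
import random

def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

class LSTMCell:
    """One LSTM cell with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        cols = input_size + hidden_size + 1  # +1 for the bias term
        # One weight row per hidden unit for each gate:
        # i = input, f = forget, g = candidate cell, o = output.
        self.w = {gate: [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                         for _ in range(hidden_size)]
                  for gate in "ifgo"}

    def step(self, x, h, c):
        z = list(x) + list(h) + [1.0]

        def gate(name, fn):
            return [fn(sum(w * v for w, v in zip(row, z))) for row in self.w[name]]

        i, f, o = gate("i", _sigmoid), gate("f", _sigmoid), gate("o", _sigmoid)
        g = gate("g", math.tanh)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

def run(cell, frames):
    """Feed frames through one cell in order and return the final hidden state."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for x in frames:
        h, c = cell.step(x, h, c)
    return h

def bilstm_encode(frames, forward_cell, backward_cell):
    """Concatenate the final forward state with the final backward state."""
    return run(forward_cell, frames) + run(backward_cell, list(reversed(frames)))

rng = random.Random(0)
fwd, bwd = LSTMCell(4, 3, rng), LSTMCell(4, 3, rng)
clip = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1], [0.0, 0.5, 0.5, 0.0]]
encoding = bilstm_encode(clip, fwd, bwd)  # 6 values: 3 forward + 3 backward
```

The backward cell reads the clip from the last frame to the first, so the combined encoding reflects both how a gesture starts and how it ends. A classifier layer over the sign vocabulary would sit on top of this vector.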

Computer Vision Pipeline

Implemented MediaPipe for real-time hand and body pose detection
Developed custom preprocessing algorithms to normalize gesture data
Created a robust feature extraction system that captures the nuances of BdSL
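
One common way to normalize hand landmarks, shown here as a sketch rather than the project's actual preprocessing, is to move the wrist to the origin and divide by a reference bone length so that the features do not depend on where the hand is in the frame or how close it is to the camera. The indices follow the 21-point MediaPipe hand model (0 is the wrist, 9 is the middle finger MCP joint); normalize_hand and flatten are hypothetical helper names.

```python
import math

WRIST = 0              # MediaPipe hand landmark index for the wrist
MIDDLE_FINGER_MCP = 9  # base knuckle of the middle finger

def normalize_hand(landmarks):
    """Translate so the wrist is the origin, then scale by wrist-to-knuckle distance.

    `landmarks` is a list of 21 (x, y, z) tuples for one hand in one frame.
    """
    wx, wy, wz = landmarks[WRIST]
    centered = [(x - wx, y - wy, z - wz) for x, y, z in landmarks]
    scale = math.dist((0.0, 0.0, 0.0), centered[MIDDLE_FINGER_MCP])
    if scale == 0.0:  # degenerate detection: skip scaling instead of dividing by zero
        return centered
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]

def flatten(landmarks):
    """Turn 21 (x, y, z) tuples into one 63-value feature vector."""
    return [value for point in landmarks for value in point]
```

With this scheme, the same handshape held nearer to the camera or in a different corner of the frame produces approximately the same feature vector.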

Backend Development

Used the Spring Boot framework (Java) to build a robust and scalable REST API
Designed comprehensive user management and authentication systems
Implemented secure endpoints for video processing and translation services
Built feedback collection and model improvement pipelines
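
As an illustration of how collected corrections could feed model improvement, the sketch below accepts a correction only when several users agree on it. It is written in Python for brevity even though the backend is Spring Boot; the record layout, the function name aggregate_corrections, and both thresholds are assumptions made for this example.

```python
from collections import Counter, defaultdict

def aggregate_corrections(feedback, min_votes=3, min_agreement=0.6):
    """Pick relabeled clips that enough users agree on.

    feedback: iterable of (clip_id, predicted_label, corrected_label).
    Returns {clip_id: accepted_label}.
    """
    votes = defaultdict(Counter)
    for clip_id, _predicted, corrected in feedback:
        votes[clip_id][corrected] += 1

    accepted = {}
    for clip_id, counter in votes.items():
        label, count = counter.most_common(1)[0]
        total = sum(counter.values())
        if total >= min_votes and count / total >= min_agreement:
            accepted[clip_id] = label
    return accepted
```

Requiring agreement guards against a single mistaken or malicious correction being added to the training set.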

Frontend Development

React.js for a responsive and interactive user interface
WebRTC for seamless webcam integration
Material-UI components for accessibility-first design
Real-time translation display with confidence scoring
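
Confidence scoring can be derived from the classifier's raw outputs. The following sketch assumes the model emits one logit per sign and that low-confidence predictions are suppressed rather than shown; the 0.6 threshold and the name top_prediction are placeholders, not values from the project.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_prediction(logits, labels, threshold=0.6):
    """Return (label, confidence); label is None when confidence is below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    return (labels[best] if confidence >= threshold else None), confidence
```

The interface can then display the label alongside its confidence, or show nothing when the model is unsure.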

Challenges we ran into

Data Limitations

Working with the BDSLW60 dataset presented unique challenges:

Limited vocabulary: The dataset contains only 60 words, requiring creative approaches to expand functionality
Data quality variations needed extensive preprocessing and cleaning
Regional signing variations within the dataset required normalization techniques

Model Architecture Complexity

Implementing BiLSTM for sign language recognition involved:

Sequence alignment challenges for variable-length gesture videos
Temporal feature extraction to capture the dynamic nature of sign language
Overfitting prevention with limited training data
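
A typical way to handle variable-length clips, sketched here under the assumption that each frame is already a flat feature vector, is to sample long clips down to a fixed number of frames and zero-pad short ones while keeping a mask of the real frames. The helper name to_fixed_length is hypothetical.

```python
def to_fixed_length(frames, target_len):
    """Uniformly sample long clips and zero-pad short ones.

    Returns (frames, mask) where mask holds 1 for real frames and 0 for padding.
    """
    if not frames:
        raise ValueError("clip has no frames")
    n = len(frames)
    if n >= target_len:
        if target_len == 1:
            indices = [0]
        else:
            # Evenly spaced indices from the first frame to the last.
            indices = [round(i * (n - 1) / (target_len - 1)) for i in range(target_len)]
        return [frames[i] for i in indices], [1] * target_len
    width = len(frames[0])
    padding = [[0.0] * width for _ in range(target_len - n)]
    return list(frames) + padding, [1] * n + [0] * (target_len - n)
```

The mask lets a sequence model ignore padded frames so that short signs are not diluted by zeros.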

Real-time Performance Optimization

Balancing model accuracy with inference speed for live translation
Memory management for continuous video processing
Latency optimization to ensure a smooth user experience
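
One way to keep memory bounded during continuous video is a fixed-size sliding window that triggers inference only every few frames. The sketch below illustrates that idea; the class name FrameWindow, the window size, and the stride are assumptions, not the project's settings.

```python
from collections import deque

class FrameWindow:
    """Keep only the most recent `size` feature vectors and signal when to run inference."""

    def __init__(self, size=30, stride=10):
        self.frames = deque(maxlen=size)  # old frames are discarded automatically
        self.stride = stride
        self._since_last = 0

    def push(self, features):
        """Add one frame; return the current window when inference is due, else None."""
        self.frames.append(features)
        self._since_last += 1
        if len(self.frames) == self.frames.maxlen and self._since_last >= self.stride:
            self._since_last = 0
            return list(self.frames)
        return None

window = FrameWindow(size=4, stride=2)
batches = [w for w in (window.push([i]) for i in range(8)) if w is not None]
```

A larger stride lowers the compute load at the cost of less frequent updates, which is the accuracy versus latency trade-off described above.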

Development Infrastructure

Spring Boot backend integration with machine learning models required custom solutions
Cross-platform compatibility for webcam access and video processing
API design for handling both file uploads and real-time video streams

Accomplishments that we're proud of

Technical Achievements

Successfully implemented a BiLSTM architecture for BdSL recognition using the BDSLW60 dataset
Created a functional Spring Boot backend with comprehensive API endpoints
Achieved real-time video processing capabilities for live translation
Developed a complete user management system with different access levels

System Features

Multi-user support with anonymous and registered user tiers
Interactive feedback system that allows users to correct translation errors
Practice mode with scoring for sign language learning
Comprehensive admin dashboard for system monitoring and model management

Innovation

Pioneered BiLSTM application for Bangla Sign Language recognition
Created an adaptive learning system that improves from user corrections
Developed a scalable architecture ready for deployment and expansion

What we learned

Technical Insights

BiLSTM networks are highly effective for capturing bidirectional temporal dependencies in sign language
Spring Boot works well as a framework for applications that integrate ML models
Real-time video processing requires careful optimization of both model architecture and system resources

Dataset Management

Working with BDSLW60 taught us the importance of data quality over quantity
Data augmentation techniques are crucial when working with limited vocabulary datasets
Preprocessing pipelines significantly impact model performance in sign language recognition

Community Engagement

Integrating user feedback matters greatly in assistive technology development
Iterative design based on actual user needs leads to better accessibility solutions
Scalable user management is essential for growing accessibility platforms

What's next for SilentVoice_BD

Immediate Deployment Goals

Deploy the Spring Boot backend to cloud infrastructure (AWS/Google Cloud)
Production testing with real users from the deaf community
Performance optimization for handling concurrent users
Mobile app development for Android and iOS platforms

Model Enhancement

Expand vocabulary beyond the BDSLW60 dataset to include 500+ common signs
Improve BiLSTM architecture with attention mechanisms for better accuracy
Implement ensemble methods combining multiple model approaches
Add contextual understanding for more natural translations

Feature Development

Bidirectional translation: Text/speech to BdSL using avatar generation
Group video call integration with real-time subtitles
Offline mode for areas with limited internet connectivity
Advanced analytics for tracking learning progress and system usage

Community & Research

Partner with deaf education institutions for wider adoption
Open-source the BDSLW60 processing pipeline for the research community
Expand to other regional sign languages in South Asia
Publish research findings on BiLSTM applications in sign language recognition

SilentVoice_BD represents our commitment to leveraging technology for social good. By combining the BDSLW60 dataset with BiLSTM architecture and Spring Boot infrastructure, we're building a foundation for truly inclusive communication in Bangladesh.

Built With

BDSLW60 dataset
Bidirectional LSTM (BiLSTM)
JWT (JSON Web Tokens)
Material-UI components
MediaPipe pose detection
OAuth 2.0
PostgreSQL database
Python/TensorFlow model training
React.js frontend
Redis caching
RESTful API architecture
Secure authentication and authorization
Spring Boot (Java)
User session management
WebRTC webcam integration
WebSocket real-time communication

Updates

Sayad Ibne azad started this project — Aug 25, 2025 10:01 AM EDT
