In India, millions of deaf and hard-of-hearing individuals struggle with communication because there is no real-time translation between spoken language and Indian Sign Language (ISL). This drove us to create an AI-based system that addresses this issue and encourages inclusive communication. Our goal is to help deaf individuals engage with education, healthcare, and social interactions more easily. We want to empower them to communicate on their own.
The Audio to Indian Sign Language Converter picks up spoken words through a microphone, processes the speech using AI-based speech recognition, and shows the related ISL gestures using an animated avatar. It works as a real-time communication link between hearing and non-hearing individuals, turning voice input into visual sign language output.
- Speech Recognition: We used the Google Speech API and Whisper for accurate audio transcription.
- Natural Language Processing (NLP): We used spaCy and Hugging Face Transformers to understand spoken phrases and match them to ISL signs.
- Computer Vision & Animation: We created gesture animations using OpenCV, MediaPipe, and Blender.
- Frontend & Backend: We developed a web app with a React.js frontend and a Flask backend, using MongoDB for data management.
- Testing: We improved real-time performance and tested the avatar’s clarity and response speed.
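The step between transcription and animation can be sketched roughly as follows. This is a minimal illustration, not our production code: the `SIGN_CLIPS` dictionary, the `SKIP_WORDS` set, the clip IDs, and the `text_to_sign_sequence` function are all hypothetical names, and falling back to letter-by-letter fingerspelling for unknown words is an assumption about how missing signs could be handled.

```python
import re

# Hypothetical sign dictionary: word -> animation clip ID (illustrative only).
SIGN_CLIPS = {
    "hello": "clip_hello",
    "thank": "clip_thank",
    "you": "clip_you",
    "doctor": "clip_doctor",
}

# Words skipped in this sketch. This set is an assumption for illustration,
# not a linguistic rule taken from the project.
SKIP_WORDS = {"a", "an", "the", "is", "are", "am"}


def text_to_sign_sequence(transcript: str) -> list[str]:
    """Turn a transcript into an ordered list of avatar clip IDs."""
    # Lowercase the text and keep only alphabetic tokens.
    tokens = re.findall(r"[a-z]+", transcript.lower())
    sequence = []
    for token in tokens:
        if token in SKIP_WORDS:
            continue
        if token in SIGN_CLIPS:
            # Known word: play its dedicated sign clip.
            sequence.append(SIGN_CLIPS[token])
        else:
            # Unknown word: fall back to one fingerspelling clip per letter.
            sequence.extend(f"letter_{ch}" for ch in token)
    return sequence


print(text_to_sign_sequence("Hello, the doctor is in"))
```

In a full system, the transcript would come from the speech recognizer and the returned clip IDs would be handed to the avatar renderer in order.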
- Limited datasets for mapping Indian Sign Language.
- Synchronizing audio input and avatar animation with low latency.
- Designing realistic and smooth hand gestures.
- Integrating multiple AI modules smoothly with limited hardware resources.
- We built a working prototype that translates speech to ISL gestures in real time.
- We developed an inclusive platform that supports accessibility and awareness.
- We learned how to combine multiple technologies—speech, NLP, and vision—into a single system.
- We created a user-friendly web interface that works on both desktop and mobile.
We gained practical experience with:
- Speech and text processing using AI models.
- 3D animation and computer vision for gesture rendering.
- Integrating backend and frontend for real-time use.
- Understanding the social and emotional effects of inclusive technology.
We plan to expand the ISL gesture database to include regional variations.
We aim to add reverse translation—converting ISL gestures to speech for two-way communication.
We want to launch a mobile app version to reach a broader audience.
We will collaborate with institutions that support the deaf community to enhance dataset quality.
We are exploring AI-driven avatar personalization to make gestures more expressive.
Built With
- blender
- css
- django
- firebase
- flask
- google-cloud
- googlespeechapi
- html
- huggingfacetransformers
- javascript
- jupyter
- mediapipe
- mongodb
- nltk
- node.js
- opencv
- python
- pytorch
- react.js
- spacy
- tensorflow
- visual-studio
- vosk
- whisper