Inspiration
Our team was inspired by the significant multimedia accessibility gap that exists for the hearing impaired. We recognized that while visual content is often available, the rich emotional and contextual information conveyed through sound is largely inaccessible. We wanted to create a solution that would provide a sensory experience equivalent to auditory perception, allowing users to "feel" the sound and emotions in multimedia content.
What it does
Our project, the Haptic Vibration Device, converts audio-visual content into meaningful, structured vibration patterns. It creates a multi-dimensional tactile language that conveys both the nature and emotional content of sounds. This allows hearing-impaired individuals to experience rich multimedia content by feeling distinct vibration patterns that correspond to different event types (like rumble, impact, pulse) and varying intensities that reflect emotional content and urgency.
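To make the idea of event types and intensities concrete, here is a minimal sketch of how one vibration segment might be represented and expanded into motor amplitudes. The three event types come from the description above. Everything else (the `VibrationSegment` class, the `to_waveform` helper, the 50 ms step, and the shape of each pattern) is an illustrative assumption, not the device's actual firmware or data format.

```python
from dataclasses import dataclass

# Event types named in the project description.
EVENT_TYPES = ("rumble", "impact", "pulse")


@dataclass(frozen=True)
class VibrationSegment:
    """Hypothetical container for one stretch of tactile output."""
    event_type: str    # one of EVENT_TYPES
    intensity: float   # 0.0 (off) to 1.0 (strongest); stands in for emotion/urgency
    duration_ms: int   # length of the segment in milliseconds

    def __post_init__(self):
        if self.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.event_type!r}")
        if not 0.0 <= self.intensity <= 1.0:
            raise ValueError("intensity must be between 0.0 and 1.0")
        if self.duration_ms <= 0:
            raise ValueError("duration_ms must be positive")


def to_waveform(segment, step_ms=50):
    """Expand a segment into motor amplitudes, one value per step_ms.

    The pattern shapes below are assumptions chosen to be easy to tell apart.
    """
    steps = max(1, segment.duration_ms // step_ms)
    if segment.event_type == "rumble":
        # Steady vibration held below the peak intensity.
        return [segment.intensity * 0.6] * steps
    if segment.event_type == "impact":
        # Sharp onset that decays linearly towards zero.
        return [segment.intensity * (1 - i / steps) for i in range(steps)]
    # "pulse": alternate between on and off.
    return [segment.intensity if i % 2 == 0 else 0.0 for i in range(steps)]
```

For example, `to_waveform(VibrationSegment("impact", 1.0, 200))` yields four amplitudes that fall from full strength, while the same call with `"pulse"` alternates between full strength and zero.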
How we built it
We built our solution through a multi-step pipeline, integrating various AI technologies:
Visual Scene Analysis: We used the Runware API to extract visual descriptions and key elements from scenes, helping us recognize events that often produce sounds.
Audio Analysis: We integrated ElevenLabs for frame-by-frame dialogue and background music extraction, and Hugging Face models to analyze the emotional context of the audio.
Classification & Preprocessing: The Gemini LLM played a crucial role here, analyzing and processing the combined visual and audio data to prepare structured vibration patterns, including an Emotion Hex Model.
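The pipeline above can be sketched roughly as follows, under clearly stated assumptions. The snippet does not call Runware, ElevenLabs, Hugging Face, or Gemini. It takes their outputs as plain dictionaries keyed by timestamp, merges them into one timeline, and maps each merged frame to a vibration cue. The `FrameAnalysis` class, the `EVENT_TO_PATTERN` table, the example labels, and the intensity formula are all hypothetical stand-ins, not our actual rules or the Emotion Hex Model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameAnalysis:
    """Hypothetical merged record for one timestamp."""
    time_s: float
    visual_event: Optional[str]   # label from the visual scene analysis stage
    emotion: Optional[str]        # label from the audio emotion stage
    emotion_score: float = 0.0    # confidence in the emotion label, 0.0 to 1.0


def merge_timelines(visual, audio):
    """Join the visual and audio stage outputs into one sorted timeline.

    visual: dict mapping time_s -> event label
    audio:  dict mapping time_s -> (emotion label, score)
    """
    merged = []
    for t in sorted(set(visual) | set(audio)):
        emotion, score = audio.get(t, (None, 0.0))
        merged.append(FrameAnalysis(t, visual.get(t), emotion, score))
    return merged


# Illustrative rule table only; the real rule set is not shown here.
EVENT_TO_PATTERN = {
    "explosion": "impact",
    "thunder": "rumble",
    "heartbeat": "pulse",
}


def to_cue(frame):
    """Map a merged frame to (time_s, pattern, intensity), or None if no rule matches."""
    pattern = EVENT_TO_PATTERN.get(frame.visual_event)
    if pattern is None:
        return None
    # Assumed formula: start at half strength and scale up with emotion confidence.
    intensity = min(1.0, 0.5 + 0.5 * frame.emotion_score)
    return (frame.time_s, pattern, round(intensity, 2))
```

Frames with no matching rule return `None`, so a caller can skip them or fall back to a default pattern.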
Challenges we ran into
We faced several challenges during development:
Quality of Vibration Data: Ensuring the vibration output accurately and effectively conveyed the nuances of the original sound was a significant hurdle.
Analysis Step Consistency: Maintaining consistency across each step of the analysis pipeline, from visual to audio processing, proved complex.
LLM Inconsistency: LLM output varied across different custom content types, so achieving reliable results required careful fine-tuning and iteration.
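One common way to reduce the impact of inconsistent LLM output is to validate every reply before it reaches the haptic stage. The sketch below assumes the model is asked to answer with a JSON object containing `pattern` and `intensity` fields. That reply format, the `parse_llm_cue` helper, and the clamping behavior are assumptions for illustration, not a description of our prompt or of the Gemini API.

```python
import json

# Pattern names taken from the event types mentioned in the project description.
ALLOWED_PATTERNS = {"rumble", "impact", "pulse"}


def parse_llm_cue(raw_text):
    """Parse and validate one LLM reply; return a clean dict, or None if unusable."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return None  # reply was not valid JSON
    if not isinstance(data, dict):
        return None  # valid JSON, but not an object

    # Normalize casing and whitespace before checking the pattern name.
    pattern = str(data.get("pattern", "")).strip().lower()
    if pattern not in ALLOWED_PATTERNS:
        return None

    try:
        intensity = float(data.get("intensity"))
    except (TypeError, ValueError):
        return None  # missing or non-numeric intensity

    # Clamp out-of-range values instead of rejecting the whole reply.
    intensity = max(0.0, min(1.0, intensity))
    return {"pattern": pattern, "intensity": intensity}
```

A caller could retry the request or reuse the previous cue whenever this function returns `None`.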
Accomplishments that we're proud of
We are particularly proud of having created our own haptic rule algorithm. This custom algorithm is the core of our innovation, allowing us to successfully transform complex audio-visual data into meaningful and distinct tactile feedback patterns. We're also proud of integrating multiple AI technologies to achieve our goal.
What we learned
Through this project, we learned that custom haptic rule algorithms have immense potential, and that a clear demonstration can showcase the power of AI-powered, personalized tactile feedback. We also realized how difficult it is to fully translate every element of a sound into a haptic experience, which shows that there is much more we can do for the hearing-impaired community.
What's next for Haptic Device for Hearing Impaired
Our next step is to train a model on our custom rules, eventually leading to AI-generated haptic rule criteria. This will let us refine our algorithm further and expand the range of content it can handle, making the device more versatile and effective. We also plan to explore further hardware and content expansion.
Built With
- css
- elevenlabs
- html
- huggingface
- javascript
- opencv
- python
- runware
- typescript