Inspiration
SignSpeak was inspired by the communication gap faced by over 72 million deaf and hard-of-hearing individuals worldwide. Most existing sign language translation systems depend on cameras, cloud connectivity, or bulky hardware, making them expensive, privacy-invasive, and unreliable in low-light or offline conditions. We wanted to create a solution that works anywhere, respects user privacy, and is affordable for real-world adoption. Our goal was to break the “silent barrier” using Edge AI and wearable technology.
What it does
SignSpeak is a privacy-first, camera-free smart glove that translates sign language gestures into natural speech in real time. It captures finger movements and hand orientation using flex sensors and an IMU, processes the data on-device using machine learning, and converts recognized gestures into spoken output. The system works offline, delivers responses in under 300 milliseconds, and ensures seamless communication without recording video or requiring internet access.
How we built it
We built the glove using 5 flex sensors, an MPU6050 IMU, and an ESP32 microcontroller powered by a rechargeable battery. Sensor data is preprocessed and fed into a lightweight Random Forest model optimized for Edge AI deployment. The classified gestures are then refined into meaningful sentences and converted into speech using text-to-speech technology. The entire system is designed to run efficiently on-device without cloud dependency.
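
To make the preprocessing and classification steps concrete, here is a minimal Python sketch, not our actual firmware. It assumes each sample holds 5 flex readings followed by 6 IMU values (accelerometer and gyroscope x/y/z), that the features are a per-channel mean and standard deviation over a short window, and that the forest is a list of plain functions. The names `extract_features`, `forest_predict`, and the three example trees are hypothetical and exist only for illustration.

```python
from collections import Counter
from statistics import mean, pstdev

NUM_FLEX = 5  # one reading per flex sensor
NUM_IMU = 6   # assumed layout: accel x/y/z followed by gyro x/y/z


def extract_features(window):
    """Turn a window of raw samples into one flat feature vector.

    `window` is a list of samples, each a list of NUM_FLEX + NUM_IMU
    numbers. For every channel we keep the mean and the population
    standard deviation, so the result has 2 * (NUM_FLEX + NUM_IMU) values.
    """
    if not window:
        raise ValueError("window must contain at least one sample")
    channels = list(zip(*window))  # transpose: samples -> channels
    features = []
    for values in channels:
        features.append(mean(values))
        features.append(pstdev(values))
    return features


def forest_predict(trees, features):
    """Majority vote over per-tree predictions, as a Random Forest does."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical hand-written "trees"; a trained model would supply these.
def tree_a(f):
    return "HELLO" if f[0] > 0.5 else "THANKS"   # mean of flex sensor 0


def tree_b(f):
    return "HELLO" if f[2] > 0.5 else "THANKS"   # mean of flex sensor 1


def tree_c(f):
    return "HELLO" if f[10] > 0.0 else "THANKS"  # mean of accel x


if __name__ == "__main__":
    window = [
        [0.8, 0.7, 0.1, 0.1, 0.1, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
        [0.6, 0.9, 0.1, 0.1, 0.1, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    ]
    print(forest_predict([tree_a, tree_b, tree_c], extract_features(window)))
```

Keeping the features to simple per-channel statistics is one way to stay within a microcontroller's memory and latency budget; the trained trees would then be exported as nested comparisons in C or C++ for the ESP32.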
Challenges we ran into
We faced challenges in sensor calibration, reducing noise in flex readings, and handling gesture variations across users. Optimizing the machine learning model to balance accuracy and low latency within ESP32 hardware limitations was another major hurdle. Ensuring reliable real-time performance required multiple iterations and testing.
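
As an illustration of the calibration and noise problems described above, here is a minimal Python sketch of one common approach, not necessarily the one used on the glove: per-user min/max calibration followed by an exponential moving average. The class `FlexChannel`, its methods, and the smoothing factor `alpha` are hypothetical names and values chosen for the example.

```python
class FlexChannel:
    """Per-user calibration and smoothing for one flex sensor (illustrative)."""

    def __init__(self, alpha=0.3):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha    # weight given to the newest reading
        self.low = None       # smallest raw value seen during calibration
        self.high = None      # largest raw value seen during calibration
        self.smoothed = None  # current filtered, normalized value

    def calibrate(self, raw):
        """Widen the known range while the user opens and closes the hand."""
        self.low = raw if self.low is None else min(self.low, raw)
        self.high = raw if self.high is None else max(self.high, raw)

    def read(self, raw):
        """Return a smoothed reading normalized to the range [0, 1]."""
        if self.low is None or self.high == self.low:
            raise RuntimeError("calibrate with at least two distinct readings first")
        norm = (raw - self.low) / (self.high - self.low)
        norm = min(1.0, max(0.0, norm))  # clamp readings outside the calibrated range
        if self.smoothed is None:
            self.smoothed = norm
        else:
            # Exponential moving average: blend the new value with the history.
            self.smoothed = self.alpha * norm + (1 - self.alpha) * self.smoothed
        return self.smoothed
```

Normalizing each finger to the wearer's own range is one way to reduce variation across users with different hand sizes, while a larger `alpha` trades smoothness for lower latency.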
What we learned
Throughout the project, we learned how to combine embedded systems and machine learning into a cohesive Edge AI solution. Ultimately, SignSpeak represents our vision of inclusive, privacy-first assistive technology that empowers deaf individuals with seamless communication anywhere.
Built With
- arduino ide
- c
- c++
- edge ai
- esp32
- flex sensor
- gemini api
- imusensor
- jupyter notebook
- mpu6050
- python
- random forest
- tts
- vs code