## Inspiration
In a world driven by digital communication, millions of people with speech and hearing impairments face barriers to basic human interaction and digital access. Existing software is often fragmented or expensive, or it ignores regional dialects entirely. EchoAccess was inspired by the goal of a unified, frictionless, zero-cost communication hub that empowers users with disabilities. We wanted to build a bridge of inclusion that lets someone who is non-verbal or deaf express themselves and be understood instantly by anyone, anywhere in the world.
## What it does
EchoAccess is a full-stack assistive communication web app for people with speech and hearing impairments. It has an accessibility-first UI and operates through two core modules:
- The Speech Hub: A dynamic split-screen setup providing instant Text-to-Speech (for non-verbal users to speak aloud) and real-time Speech-to-Text transcription (acting as a live digital notepad for deaf users).
- Sign Language AI Vision: A camera-based interface that leverages client-side computer vision to track hand landmarks and instantly translate essential emergency signs and alphabets into text.
For both global and local reach, the entire application is backed by a translation pipeline supporting 20 global and regional languages (including English, Urdu, and Sindhi). A persistent Accessibility Toolbar toggles high-contrast mode, scales typography, and switches to dyslexia-friendly fonts to support users with cognitive or reading difficulties.
## How we built it
The application was built using a modern, scalable full-stack web architecture:
- Frontend UI/UX: Built with React and Vite, and styled with Tailwind CSS and Shadcn UI components to achieve a responsive, high-contrast interface.
- Core Accessibility Engines: Utilized the web browser's native `SpeechSynthesis` API and Web Speech API (`SpeechRecognition`) to process low-latency audio inputs and outputs directly on the client side.
- Computer Vision & AI: Integrated the Google MediaPipe Hands framework via client-side scripts to capture 21 3D hand-landmark coordinates from an HTML5 webcam stream, mapping them onto a local coordinate matrix for real-time sign language recognition.
- Backend & Database: Connected to Supabase for secure user state management and cloud logging. A custom database schema stores all session metrics, language choices, and `translation_history`, updating a real-time activity feed layout.
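This write-up does not spell out how the 21 landmarks are mapped onto the local coordinate matrix. One common approach, shown here as a hypothetical sketch rather than EchoAccess's actual code, is to re-express each point relative to the wrist and divide by a hand-size reference, so the matrix does not depend on where the hand sits in the frame or how close it is to the camera. The `Landmark` type and the `normalizeLandmarks` name are assumptions; indices 0 (wrist) and 9 (middle-finger MCP) follow the MediaPipe Hands landmark numbering.

```typescript
// Hypothetical shape of one hand landmark (x, y, z), as a hand tracker might report it.
interface Landmark {
  x: number;
  y: number;
  z: number;
}

const WRIST = 0; // MediaPipe Hands landmark index for the wrist
const MIDDLE_FINGER_MCP = 9; // landmark index for the middle-finger knuckle

// Returns a 21x3 matrix: wrist at the origin, scaled by the wrist-to-knuckle distance.
function normalizeLandmarks(landmarks: Landmark[]): number[][] {
  if (landmarks.length !== 21) {
    throw new Error(`expected 21 landmarks, got ${landmarks.length}`);
  }
  const wrist = landmarks[WRIST];
  const knuckle = landmarks[MIDDLE_FINGER_MCP];
  const handSize = Math.hypot(
    knuckle.x - wrist.x,
    knuckle.y - wrist.y,
    knuckle.z - wrist.z,
  );
  const scale = handSize > 0 ? handSize : 1; // avoid dividing by zero on degenerate input
  return landmarks.map((p) => [
    (p.x - wrist.x) / scale,
    (p.y - wrist.y) / scale,
    (p.z - wrist.z) / scale,
  ]);
}
```

A gesture classifier could then compare this normalized matrix against stored templates instead of raw screen coordinates.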
## Challenges we ran into
One of the primary roadblocks was handling frame-by-frame flickering and accuracy inside the webcam video stream when predicting sign language gestures. The model would frequently jump between characters due to micro-movements of the hand. We overcame this by writing a custom client-side smoothing/debounce algorithm that delays confirmation until a hand posture remains stable for 500ms before pushing the text to the string buffer.
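The hold-until-stable idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `StableSignFilter` class, its `update` method, and the per-frame label input are all hypothetical names, and only the 500 ms window comes from the description above.

```typescript
// Hypothetical helper: confirms a predicted sign only after it stays unchanged for `holdMs`.
class StableSignFilter {
  private holdMs: number;
  private candidate: string | null = null; // label currently being watched
  private candidateSince = 0; // timestamp when the candidate first appeared
  private emitted = false; // whether the candidate was already confirmed

  constructor(holdMs: number = 500) {
    this.holdMs = holdMs;
  }

  // Feed one per-frame prediction; returns the label once when it becomes stable, else null.
  update(label: string | null, nowMs: number): string | null {
    if (label !== this.candidate) {
      // Prediction changed (for example, a flicker): restart the stability timer.
      this.candidate = label;
      this.candidateSince = nowMs;
      this.emitted = false;
      return null;
    }
    const heldLongEnough = nowMs - this.candidateSince >= this.holdMs;
    if (label !== null && !this.emitted && heldLongEnough) {
      this.emitted = true; // emit each stable posture only once
      return label;
    }
    return null;
  }
}
```

In a frame loop, each prediction would be passed to `update` along with a timestamp such as `performance.now()`, and any non-null result appended to the text buffer.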
Additionally, orchestrating asynchronous backend calls to translate large amounts of real-time transcribed text across 20 distinct language locales without blocking the main UI thread required rigorous optimization of React hooks and loading states.
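One piece of this kind of orchestration is making sure a slow, outdated translation response never overwrites a newer one. The sketch below shows a "latest request wins" guard under stated assumptions: `createLatestOnlyTranslator` and the `Translate` signature are hypothetical, and the real backend call is replaced by whatever function is passed in.

```typescript
// Hypothetical signature for a backend translation call.
type Translate = (text: string, locale: string) => Promise<string>;

// Wraps a translate function so that only the most recent request yields a result.
function createLatestOnlyTranslator(translate: Translate) {
  let latestRequestId = 0;

  return async function run(text: string, locale: string): Promise<string | null> {
    const requestId = ++latestRequestId; // tag this call
    const result = await translate(text, locale);
    // If a newer call started while this one was in flight, drop the stale result.
    return requestId === latestRequestId ? result : null;
  };
}
```

In a React component, a `null` result would simply be ignored, so the displayed translation and its loading state always reflect the latest transcript.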
## Accomplishments that we're proud of
We are proud of building a solution that avoids heavy server costs: by relying on native browser APIs and client-side processing, the app stays fast and lightweight. Successfully mapping complex hand coordinates for critical emergency gestures and watching them convert into text in real time was a major breakthrough. Most importantly, a tool that can translate its output into widely underrepresented regional languages like Urdu and Sindhi makes this a truly inclusive project for marginalized communities.
## What we learned
This challenge deepened our technical expertise in managing browser-level media devices, real-time audio streams, and coordinate-based computer vision tracking in JavaScript. On a deeper level, we learned that true innovation in accessibility isn’t about adding a million complex, flashy features; it is about designing software with empathy, ensuring that clean typography, intuitive color contrasts, and stable layouts are prioritized right from the very first line of code.
## What's next for EchoAccess
The prototype built during this hackathon is just the beginning. Our next steps for EchoAccess include:
- Developing a custom-trained, lightweight deep learning model to expand the Sign Language dictionary from basic alphabets to full sentence-level structures.
- Packaging the platform into a lightweight cross-platform mobile application using React Native to provide on-the-go mobility support.
- Integrating eye-tracking capabilities using open-source libraries (`WebGazer.js`) to expand the hands-free navigation controls for users with severe physical motor impairments.