Hi, I'm Sarvarbek, and I'm building Voxly because I've seen how painfully lonely communication can feel. Imagine a child who can't speak or hear clearly, perhaps because of cerebral palsy or autism, sitting at a family dinner. They gesture excitedly about their day, but everyone just smiles awkwardly, nods, and moves on. The child feels invisible. Or picture an adult with ALS in a doctor's office: the doctor speaks quickly, the patient understands but can't respond easily, and vital details get lost. These aren't rare stories; they happen every day to millions. Deaf and non-verbal people often feel shut out of conversations, forced to rely on interpreters who aren't always available, or to stay silent to avoid the frustration.

For decades, people have tried to fix this. Since the 1970s, researchers have dreamed of automatic sign language translation. We've seen smart gloves (like Sign-IO in Kenya), wearables, and one-way apps that turn signs into speech, but rarely the full circle: real-time signs to speech/text AND speech/text back to visual signs, all using just a simple camera, no extra hardware, and accessible to anyone with a phone or laptop. Many projects stayed academic, limited by data shortages and by huge variability in gestures, lighting, signing speeds, and regional accents, or they never made it public because the two-way flow was too hard to get smooth and natural.

Voxly is my small but passionate MVP attempt to change that. With a webcam or phone camera:

- A non-verbal person imitates gestures or signs → Voxly watches, analyzes the movements with AI (pose detection + my custom model), and instantly turns them into clear text on screen plus a spoken voice for the other person.
- The hearing person speaks → Voxly transcribes the speech to text, then generates visual sign animations (an avatar or key signs) so the first person truly understands.
- Or they just type → instant signs appear.
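To make the sign-to-text direction concrete, here is a minimal sketch of the recognition step. It assumes landmark sequences have already been extracted by a pose detector (such as MediaPipe), and it stands in for my custom model with a simple nearest-template classifier; the gesture names and template coordinates below are hypothetical placeholders, not real training data.

```python
from math import dist

# Hypothetical gesture templates: gesture name -> sequence of 2D keypoints,
# as a pose detector might emit per frame (normalized 0..1 coordinates).
GESTURE_TEMPLATES = {
    "hello": [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)],
    "thank you": [(0.9, 0.1), (0.7, 0.3), (0.5, 0.5)],
}

def sequence_distance(a, b):
    """Mean point-to-point distance between two keypoint sequences."""
    pairs = list(zip(a, b))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)

def classify_gesture(landmarks):
    """Return the gesture whose template sequence is closest to the input."""
    return min(
        GESTURE_TEMPLATES,
        key=lambda name: sequence_distance(landmarks, GESTURE_TEMPLATES[name]),
    )

# Example: landmarks near the "thank you" template are labeled as such.
print(classify_gesture([(0.85, 0.15), (0.7, 0.3), (0.55, 0.45)]))
```

In the real app, the classifier's output text would then be handed to a text-to-speech engine for the hearing person, and the reverse direction (speech → text → sign animation) would run the same pipeline backwards.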

It's bidirectional, real-time, and simple: no interpreters needed for everyday chats. But Voxly isn't just translation. Inside the app, we'll add free courses: beginner sign language lessons for curious friends and family, and deeper dives to truly enter the Deaf and non-verbal world and build real empathy. This isn't perfect yet (it's an MVP), but it's born from wanting no one to feel "muted" again. I believe we're among the first to put true two-way, hardware-free, camera-based bridging and education together in one tool, powered by AI like Amazon Nova for better multimodal understanding. Let's make conversations feel human again. Thank you for reading the heart behind Voxly.
