Inspiration
Roughly 2 million people worldwide are DeafBlind — they cannot rely on sight or sound. Existing assistive technologies serve one sense at a time: screen readers for the blind, captioning for the deaf. Almost nothing fuses both modalities into a single touch-first interface.
I wanted to build a system where a DeafBlind user could feel their environment in real time:
- A smoke alarm → long pulse on the wrist
- A doorbell → distinct rhythm
- A sign-language wave → Braille word on a display
The goal was a working prototype — not just a concept.
What It Does
Sankalp is a real-time multimodal accessibility engine with three parallel inputs:
- Microphone → Whisper (speech) + YAMNet (sound classification)
- Webcam → MediaPipe (sign-language gesture detection)
- Knowledge queries → Wolfram Alpha + Gemini
All inputs are converted into a unified SemanticEvent:
type, content, urgency (0–10), source, timestamp
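Concretely, the event could be a small Pydantic model. A minimal sketch assuming Pydantic v2; the field names follow the list above, while the defaults and validation bounds are my assumptions, not the project's exact schema:

```python
from datetime import datetime, timezone
from pydantic import BaseModel, Field


class SemanticEvent(BaseModel):
    """Unified event emitted by every input pipeline (sketch)."""
    type: str        # e.g. "speech", "sound", "sign", "knowledge"
    content: str     # human-readable payload ("doorbell", "fire alarm", ...)
    urgency: int = Field(ge=0, le=10)   # 0 = ambient, 10 = emergency
    source: str      # "microphone", "webcam", or "query"
    timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
```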
Outputs
- 6-dot animated Braille grid (Grade-1 UEB; encoding sketched below)
- Haptic vibration patterns (via phone)
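Grade-1 (uncontracted) UEB boils down to a character-to-dots table. The dot numbers below are standard Braille; the table and function names are illustrative, not the project's actual encoder:

```python
# Grade-1 (uncontracted) Braille: each character maps to a set of raised
# dots numbered 1-6 (1-3 down the left column, 4-6 down the right).
BRAILLE_DOTS = {
    "a": {1},          "b": {1, 2},       "c": {1, 4},
    "d": {1, 4, 5},    "e": {1, 5},       "f": {1, 2, 4},
    "g": {1, 2, 4, 5}, "h": {1, 2, 5},    "i": {2, 4},
    "j": {2, 4, 5},    " ": set(),        # ...rest of the alphabet omitted
}

def encode_grade1(text: str) -> list[set[int]]:
    """One dot-set per character; unmapped characters are skipped."""
    return [BRAILLE_DOTS[ch] for ch in text.lower() if ch in BRAILLE_DOTS]
```

The animated SVG grid can then raise exactly those dots for each cell.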
Haptic Grammar (Core Innovation)
| Pattern (ms) | Meaning | Trigger |
|---|---|---|
| [2000] | Emergency | urgency ≥ 9 |
| [120,80,120,80,400] | Name being called | speech |
| [100,80,100,80,100] | Doorbell / knock | sound |
| [600] | Sign detected | vision |
| [80] | Default notification | other |
Rule: Urgency overrides everything.
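In code the grammar collapses to a lookup plus one override. A rough sketch using the SemanticEvent above; the pattern values are copied from the table, and keying purely off event type (rather than, say, detecting the user's name in speech) is a simplification:

```python
# Vibration patterns in ms (on, off, on, ...), matching the table above.
PATTERNS = {
    "emergency": [2000],
    "speech":    [120, 80, 120, 80, 400],
    "sound":     [100, 80, 100, 80, 100],
    "sign":      [600],
    "default":   [80],
}

def haptic_pattern(event: SemanticEvent) -> list[int]:
    """Urgency overrides everything; otherwise map event type to a pattern."""
    if event.urgency >= 9:
        return PATTERNS["emergency"]
    return PATTERNS.get(event.type, PATTERNS["default"])
```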
How I Built It
Backend (Python + FastAPI)
- Event system with SemanticEvent
- Audio pipeline (Whisper + YAMNet)
- Vision pipeline (MediaPipe)
- Braille encoder (custom Python)
- Haptic encoder (pattern mapping)
- WebSocket system with real-time streaming
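The streaming piece can be a plain FastAPI WebSocket endpoint that pushes each event to the UI as JSON. A minimal sketch assuming a single connected client and an asyncio.Queue that the pipelines feed; the route path and queue name are placeholders:

```python
import asyncio
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()
event_queue: asyncio.Queue = asyncio.Queue()   # pipelines put SemanticEvents here

@app.websocket("/ws/events")
async def stream_events(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            event = await event_queue.get()                     # next SemanticEvent
            await ws.send_json(event.model_dump(mode="json"))   # Pydantic v2 -> JSON-safe dict
    except WebSocketDisconnect:
        pass  # frontend disconnected; stop streaming
```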
Frontend (Next.js + TypeScript)
- Animated Braille grid (SVG)
- Haptic visualizer
- Event feed with urgency colors
- Knowledge query panel
- Emergency full-screen alert
Challenges
- liblouis issues on Windows → built custom Braille encoder
- No tflite-runtime support → switched to ai-edge-litert
- MediaPipe API changes → migrated to Tasks API
- Gemini rate limits → built retry + limiter system (sketched after this list)
- Real-time tradeoff → freshness > completeness
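Conceptually the retry and limiter layer is just a minimum gap between calls plus exponential backoff. A sketch with placeholder timings, not the project's actual tuning:

```python
import asyncio
import time

MIN_INTERVAL = 4.0   # seconds between Gemini calls (placeholder free-tier pacing)
_last_call = 0.0

async def call_with_retry(fn, *args, retries: int = 3, **kwargs):
    """Pace calls to MIN_INTERVAL, then retry with exponential backoff on failure."""
    global _last_call
    for attempt in range(retries):
        wait = MIN_INTERVAL - (time.monotonic() - _last_call)
        if wait > 0:
            await asyncio.sleep(wait)
        _last_call = time.monotonic()
        try:
            return await fn(*args, **kwargs)
        except Exception:
            if attempt == retries - 1:
                raise
            await asyncio.sleep(2 ** attempt)   # back off 1s, 2s, 4s, ...
```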
Accomplishments
- Unified SemanticEvent architecture
- Designed Haptic Grammar (core innovation)
- 51 tests passing
- Fully working end-to-end real-time system
What I Learned
- Stub-first development works best
- ML libraries change fast (breaking changes)
- Urgency > confidence in accessibility systems
- Free-tier APIs are enough if optimized
What’s Next
- Expand sign-language recognition (20+ signs)
- Add Grade-2 Braille
- Support real Braille hardware (Bluetooth)
- Build wearable version (wristband)
- Enable full offline privacy mode
Built With
- ai-edge-litert
- asyncio
- fastapi
- faster-whisper
- framer-motion
- gemini
- google-genai
- httpx
- mediapipe
- next.js
- numpy
- opencv
- pydantic
- pytest
- python
- react
- sounddevice
- tailwindcss
- typescript
- uvicorn
- web-vibration-api
- webrtc
- websockets
- wolfram-technologies
- yamnet