Inspiration
Typing shouldn’t require hands. We wanted a zero-hardware, fully browser-based way for anyone to write, click, and navigate—using just their eyes and subtle head motion.
What it does
VisionKey turns your webcam into a gaze/blink controller:
Moves a soft “cursor” where you look.
Snaps to nearby keys and word suggestions for accuracy.
Selects with a dwell or intentional blink.
Types on an on-screen keyboard with lightweight next-word predictions.
Calibrates in under a minute and runs entirely on-device in the browser.
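The "snap to nearby keys" behavior can be pictured as a nearest-target search with a radius limit. The sketch below is only an illustration under assumed names: `KeyTarget`, `snapToKey`, and the `snapRadius` parameter are hypothetical and not taken from the VisionKey source.

```typescript
// Hypothetical key layout entry: the key's center in screen pixels.
interface KeyTarget {
  label: string;
  x: number;
  y: number;
}

// Return the closest key whose center lies within snapRadius of the gaze
// point, or null when no key is close enough to snap to.
function snapToKey(
  gazeX: number,
  gazeY: number,
  keys: KeyTarget[],
  snapRadius: number
): KeyTarget | null {
  let best: KeyTarget | null = null;
  let bestDist = snapRadius;
  for (const key of keys) {
    const dist = Math.hypot(key.x - gazeX, key.y - gazeY);
    if (dist <= bestDist) {
      best = key;
      bestDist = dist;
    }
  }
  return best;
}
```

A larger `snapRadius` makes keys feel stickier but makes it harder to rest the cursor between them, which is the trade-off the magnet tuning has to balance.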
How we built it
Tracking: Google MediaPipe face landmark detection.
Fusion: Eye vectors + head pose blended, then smoothed with a One-Euro filter.
UI: Magnetized keyboard, dwell ring, and a prediction bar. Small WebAudio beeps confirm actions.
Calibration: Quadratic mapping from feature space → screen coordinates; saved locally for instant reuse.
Safety: Blink hysteresis, cooldowns, and distance checks to avoid accidental clicks.
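For readers unfamiliar with the One-Euro filter named in the fusion step, here is a minimal sketch of the published algorithm for a single coordinate. The class name and the default parameter values are assumptions chosen for illustration, not VisionKey's actual tuning.

```typescript
// One-Euro filter for one scalar signal (run one instance per axis).
// Low speeds get a low cutoff (less jitter); fast moves raise the cutoff (less lag).
class OneEuroFilter {
  private xPrev: number | null = null;
  private dxPrev = 0;
  private tPrev = 0;
  private readonly minCutoff: number; // Hz, smoothing floor at rest
  private readonly beta: number;      // how fast the cutoff grows with speed
  private readonly dCutoff: number;   // Hz, cutoff for the derivative estimate

  constructor(minCutoff = 1.0, beta = 0.01, dCutoff = 1.0) {
    this.minCutoff = minCutoff;
    this.beta = beta;
    this.dCutoff = dCutoff;
  }

  // Smoothing factor of a first-order low-pass filter for a given cutoff.
  private static alpha(cutoffHz: number, dt: number): number {
    const tau = 1 / (2 * Math.PI * cutoffHz);
    return 1 / (1 + tau / dt);
  }

  // Feed one sample with its timestamp in seconds; returns the smoothed value.
  filter(x: number, tSeconds: number): number {
    if (this.xPrev === null) {
      this.xPrev = x;
      this.tPrev = tSeconds;
      return x;
    }
    const dt = tSeconds - this.tPrev;
    if (dt <= 0) return this.xPrev; // ignore out-of-order or duplicate timestamps

    // Estimate and smooth the signal's speed.
    const dx = (x - this.xPrev) / dt;
    const aD = OneEuroFilter.alpha(this.dCutoff, dt);
    const dxHat = aD * dx + (1 - aD) * this.dxPrev;

    // Adapt the cutoff to the speed, then low-pass the signal itself.
    const cutoff = this.minCutoff + this.beta * Math.abs(dxHat);
    const a = OneEuroFilter.alpha(cutoff, dt);
    const xHat = a * x + (1 - a) * this.xPrev;

    this.xPrev = xHat;
    this.dxPrev = dxHat;
    this.tPrev = tSeconds;
    return xHat;
  }
}
```

In a gaze pipeline one would typically keep two instances, one for the x coordinate and one for y, and call `filter` once per video frame with the frame timestamp.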
Challenges we ran into
CORS/hosting: Loading models reliably across dev servers and HTTPS.
Backend variance: WebGPU, WebGL, and WASM support differs by device, so we needed a robust fallback chain.
Blink robustness: Preventing cursor jumps during partial blinks and handling false positives.
UX tuning: Balancing magnet strength, dwell timing, and prediction placement so it feels “sticky” but not frustrating.
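One way to handle the backend variance described above is to try backends in preference order and keep the first one that initializes. The sketch below assumes hypothetical probe functions supplied by the caller; in a real page they might wrap checks such as whether `navigator.gpu` exists, whether a WebGL context can be created on a canvas, or whether `WebAssembly` is defined. None of these names come from the VisionKey code.

```typescript
type Backend = "webgpu" | "webgl" | "wasm";

// A probe returns true when its backend is usable; it may also throw.
type Probe = () => boolean;

// Walk the preference order and return the first backend whose probe succeeds.
// A probe that throws is treated the same as one that returns false.
function pickBackend(
  probes: Record<Backend, Probe>,
  order: Backend[] = ["webgpu", "webgl", "wasm"]
): Backend | null {
  for (const name of order) {
    try {
      if (probes[name]()) return name;
    } catch {
      // Initialization failed; fall through to the next backend.
    }
  }
  return null; // nothing usable: the UI should degrade gracefully
}
```

Returning `null` instead of throwing lets the caller show a clear "unsupported device" message rather than failing silently.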
Accomplishments that we're proud of
A fully local, no-install eye keyboard that runs in a tab.
Smooth, low-latency gaze with accidental-click prevention (snap radius + distance gate + cooldown).
Quick calibration that meaningfully improves accuracy across users and lighting.
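The accidental-click prevention can be illustrated with a small blink detector that combines hysteresis with a cooldown. This is a sketch under assumptions: the `openness` input (0 for fully closed, 1 for fully open), the thresholds, and the timing values are all hypothetical, and the distance gate is left out for brevity.

```typescript
// Detects intentional blinks from a per-frame eye-openness value in [0, 1].
class BlinkDetector {
  private closed = false;
  private closedAt = 0;
  private lastClickAt = -Infinity;
  private readonly closeBelow: number; // eye counts as closed below this
  private readonly openAbove: number;  // eye counts as open again only above this
  private readonly minHoldMs: number;  // shorter closures are natural blinks
  private readonly maxHoldMs: number;  // longer closures are rests, not clicks
  private readonly cooldownMs: number; // minimum gap between two clicks

  constructor(
    closeBelow = 0.2,
    openAbove = 0.35,
    minHoldMs = 150,
    maxHoldMs = 800,
    cooldownMs = 600
  ) {
    this.closeBelow = closeBelow;
    this.openAbove = openAbove;
    this.minHoldMs = minHoldMs;
    this.maxHoldMs = maxHoldMs;
    this.cooldownMs = cooldownMs;
  }

  // Feed one sample; returns true when this sample completes an intentional blink.
  update(openness: number, nowMs: number): boolean {
    if (!this.closed) {
      if (openness < this.closeBelow) {
        this.closed = true;
        this.closedAt = nowMs;
      }
      return false;
    }
    // Values between the two thresholds keep the "closed" state (hysteresis),
    // so a partial blink cannot toggle the state back and forth.
    if (openness <= this.openAbove) return false;

    this.closed = false;
    const held = nowMs - this.closedAt;
    const intentional = held >= this.minHoldMs && held <= this.maxHoldMs;
    const cooledDown = nowMs - this.lastClickAt >= this.cooldownMs;
    if (intentional && cooledDown) {
      this.lastClickAt = nowMs;
      return true;
    }
    return false;
  }
}
```

While `closed` is set, a caller could also freeze the cursor position, which is one plausible way to avoid the cursor jumps that partial blinks cause.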
What we learned
Small UX details (snap/“magnet,” dwell stability windows, audio ticks) matter more than raw ML accuracy.
Browser ML is viable for assistive tech if you design for fallbacks and graceful degradation.
Per-user calibration mapping beats generic heuristics: personalization helped more than extra model complexity.
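To make the calibration idea concrete, here is a minimal sketch of a quadratic mapping fitted by least squares. It assumes a two-dimensional feature `(u, v)` per sample and fits one screen axis at a time; the function names and the choice of the normal equations are illustrative assumptions, not the project's actual code.

```typescript
type Vec2 = [number, number];

// Quadratic basis for a 2D feature: 1, u, v, u^2, uv, v^2.
function quadTerms(u: number, v: number): number[] {
  return [1, u, v, u * u, u * v, v * v];
}

// Solve A x = b by Gaussian elimination with partial pivoting (A is n x n).
function solve(A: number[][], b: number[]): number[] {
  const n = b.length;
  const M = A.map((row, i) => [...row, b[i]]); // augmented matrix
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    if (Math.abs(M[pivot][col]) < 1e-12) {
      throw new Error("calibration points are degenerate");
    }
    const tmp = M[col];
    M[col] = M[pivot];
    M[pivot] = tmp;
    for (let r = col + 1; r < n; r++) {
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  const x = new Array<number>(n).fill(0);
  for (let r = n - 1; r >= 0; r--) {
    let sum = M[r][n];
    for (let c = r + 1; c < n; c++) sum -= M[r][c] * x[c];
    x[r] = sum / M[r][r];
  }
  return x;
}

// Least-squares fit of one screen coordinate via the normal equations.
// Needs at least 6 samples that are not all on one line or conic.
function fitQuadratic(features: Vec2[], targets: number[]): number[] {
  const k = 6;
  const AtA = Array.from({ length: k }, () => new Array<number>(k).fill(0));
  const Atb = new Array<number>(k).fill(0);
  features.forEach(([u, v], i) => {
    const t = quadTerms(u, v);
    for (let r = 0; r < k; r++) {
      Atb[r] += t[r] * targets[i];
      for (let c = 0; c < k; c++) AtA[r][c] += t[r] * t[c];
    }
  });
  return solve(AtA, Atb);
}

// Map a feature to a screen coordinate using fitted coefficients.
function predict(coeffs: number[], u: number, v: number): number {
  return quadTerms(u, v).reduce((acc, term, i) => acc + term * coeffs[i], 0);
}
```

Fitting would be done twice, once with the target x positions and once with the target y positions. Each result is a plain array of six numbers, so it can be serialized with `JSON.stringify` and kept in browser storage for reuse.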
What’s next for VisionKey
Personal language model for stronger predictions and corrections.
Adaptive calibration that updates passively while you type.
Symbols/emoji & navigation layer (scroll, drag, select, copy/paste).
PWA / desktop wrapper for kiosk and offline use.
Accessibility studies with diverse users to refine thresholds and ergonomics.

