Inspiration

The primary inspiration for Lumina Vision came from a simple question: Can we use high-speed AI to narrate the visual world for those who cannot see it?
Current accessibility tools are often slow, robotic, or siloed into specific apps. We wanted to build a "Neural Link"—a seamless, high-performance assistant that doesn't just list objects, but describes the world with narrative depth, detects immediate physical hazards, and provides tactile feedback through haptics.
What it does

Lumina Vision is an AI-powered neural lens designed for the visually impaired.
- Narrative Intelligence: Uses GPT-4o Vision to provide rich, descriptive narratives of surroundings.
- Safety Guardian: Real-time hazard detection that classifies risks (Safe, Caution, DANGER) and triggers haptic (vibration) alerts (see the sketch after this list).
- Low-Latency Architecture: Client-side image compression and streaming APIs ensure responses feel instantaneous.
- Neural Personalization: Toggle between "Professional" and "Warm" voice profiles to suit the user's environment.
- Offline-First PWA: Can be installed as a native app for a browser-free, immersive experience.
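To make the Safety Guardian concrete, here is a minimal TypeScript sketch of the risk-to-haptic mapping using the browser's Vibration API. The pulse timings shown are illustrative placeholders, not the tuned values in the app.

```ts
// Sketch: map a risk classification to a distinct haptic alert.
// Pattern timings below are illustrative, not the tuned production values.
type RiskLevel = "SAFE" | "CAUTION" | "DANGER";

// navigator.vibrate takes [vibrate, pause, vibrate, ...] durations in ms.
const HAPTIC_PATTERNS: Record<RiskLevel, number[]> = {
  SAFE: [],                        // no alert; an empty pattern also cancels any ongoing vibration
  CAUTION: [80, 120, 80],          // two short pulses
  DANGER: [200, 80, 200, 80, 200], // three long, urgent pulses
};

export function triggerHapticAlert(level: RiskLevel): void {
  // The Vibration API is unavailable on iOS Safari and desktop browsers,
  // so feature-detect before calling it.
  if (typeof navigator !== "undefined" && "vibrate" in navigator) {
    navigator.vibrate(HAPTIC_PATTERNS[level]);
  }
}
```

Where vibration is unsupported, the app's sound and voice cues can carry the alert instead.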
How we built it

Lumina is built on a high-performance Clean Architecture for modern web apps:

- Framework: Next.js with App Router and Edge Functions for global performance.
- AI Brain: OpenAI GPT-4o for vision analysis and real-time streaming chat.
- Voice Synthesis: OpenAI TTS-1 with dynamic voice profile switching.
- UI/UX: Custom "Neural Link" dark-mode aesthetic built with Tailwind CSS v4, Lucide icons, and Framer Motion for micro-animations.
- Sensory Hub: Web Audio API for native sound effects and the Vibration API for tactile haptic feedback.

Challenges we faced

- Latency vs. Quality: Sending high-res images to AI is slow. We solved this by implementing client-side compression, downscaling frames to the "sweet spot" of resolution where the AI is still accurate but the payload is 80% smaller (see the compression sketch after this list).
- Mobile Immersion: Web browsers often break the immersion for accessibility tools. We integrated PWA (Progressive Web App) standards to allow the app to be installed fullscreen, removing the address bar and giving it native-level control (see the manifest sketch after this list).
- Human-First Haptics: Designing a vibration system that communicates urgency without causing alarm fatigue required fine-tuning distinct pulse patterns.
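The compression step from the Latency vs. Quality challenge works roughly like this: draw the camera frame onto a downscaled canvas, then re-encode it as JPEG before upload. The sketch below assumes a 1024 px longest edge and 0.7 JPEG quality as stand-ins for the tuned "sweet spot" values.

```ts
// Sketch: downscale and re-encode a camera frame on the client before
// sending it to the vision API. Target size and quality are illustrative.
const MAX_DIMENSION = 1024; // assumed longest-edge target in pixels
const JPEG_QUALITY = 0.7;   // assumed JPEG quality factor

export async function compressFrame(video: HTMLVideoElement): Promise<Blob> {
  const scale = Math.min(1, MAX_DIMENSION / Math.max(video.videoWidth, video.videoHeight));
  const canvas = document.createElement("canvas");
  canvas.width = Math.round(video.videoWidth * scale);
  canvas.height = Math.round(video.videoHeight * scale);

  const ctx = canvas.getContext("2d");
  if (!ctx) throw new Error("2D canvas is not supported");
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);

  // A downscaled JPEG is a far smaller payload than the raw full-resolution frame.
  return new Promise<Blob>((resolve, reject) =>
    canvas.toBlob(
      (blob) => (blob ? resolve(blob) : reject(new Error("Frame encoding failed"))),
      "image/jpeg",
      JPEG_QUALITY
    )
  );
}
```

The resulting Blob can then be shipped to the vision endpoint, for example as base64 in a JSON body or as FormData, where the streaming GPT-4o call picks it up.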
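For the Mobile Immersion fix, the fullscreen install behavior comes from the web app manifest, which the Next.js App Router can serve from an `app/manifest.ts` file. This is a minimal sketch; the names, colors, and icon paths are placeholders.

```ts
// app/manifest.ts — sketch of a PWA manifest served by the Next.js App Router.
// Field values (name, colors, icon paths) are placeholders, not the real assets.
import type { MetadataRoute } from "next";

export default function manifest(): MetadataRoute.Manifest {
  return {
    name: "Lumina Vision",
    short_name: "Lumina",
    description: "AI-powered narration and hazard alerts for the visually impaired",
    start_url: "/",
    display: "fullscreen", // hides the address bar once the PWA is installed
    background_color: "#000000",
    theme_color: "#000000",
    icons: [
      { src: "/icon-192.png", sizes: "192x192", type: "image/png" },
      { src: "/icon-512.png", sizes: "512x512", type: "image/png" },
    ],
  };
}
```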
What we learned

We learned that the difference between an "app" and an "assistant" is in the details—haptics, sound cues, and streaming responses aren't just "polish," they are the core experience for accessibility-focused software.

What's next for Lumina Vision

- Object Tracking: Active spatial tracking with stereo audio to help users find specific items.
- Multi-Modal Memory: Remembering the layout of a room to provide "Spatial Navigation" instructions.
- Local LLMs: Integrating on-device models for basic offline safety features.