🕷️ Spidey Sense

Navigate the world, fearlessly.

AI-powered navigation assistant that helps visually impaired users explore their surroundings through real-time detection, spatial awareness, and conversational guidance.


🌎 Social Impact

More than 285 million people worldwide live with visual impairments. Traditional canes and GPS apps offer limited awareness: they can't describe nearby obstacles or open paths.

Spidey Sense transforms independence by turning vision into voice, guiding users with real-time spatial awareness, conversational AI, and natural speech for safer, more confident mobility.


🧠 Inspiration

We asked: "What if someone who can't see could still sense the world like Spider-Man?"

Most navigation tools tell you where you are, not what's around you. Visually impaired users often wonder:

"Is there something in front of me?"
"Can I move forward safely?"

So we built Spidey Sense: a friendly AI companion that sees, thinks, and speaks, helping users explore with awareness and trust.


💡 What It Does

🧠 Real-Time Object Detection: Detects people, chairs, doors, and obstacles using COCO-SSD.
🦯 Spatial Awareness Engine: Classifies objects into left, center, and right zones for precise guidance.
🔊 Voice Synthesis (ElevenLabs): Converts COCO-SSD's findings into lifelike speech.
🥁 Smart Timer: Checks the surroundings every second and guides the user accordingly.
🪄 Multi-Mode Awareness: Switch between Explore, Focus, and Follow for different contexts.
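
A minimal sketch of how the left / center / right classification could work. It assumes detections arrive in the shape COCO-SSD reports them (`bbox` as `[x, y, width, height]` in pixels, plus `class` and `score`). The helper names, the equal-thirds split, and the 0.5 confidence threshold are illustrative assumptions, not the project's actual code.

```javascript
// Hypothetical helper: map a bounding box to a zone by its horizontal center.
// Assumes the frame is split into three equal vertical strips.
function classifyZone(bbox, frameWidth) {
  const [x, , width] = bbox;
  const centerX = x + width / 2;
  if (centerX < frameWidth / 3) return "left";
  if (centerX < (2 * frameWidth) / 3) return "center";
  return "right";
}

// Group detected labels by zone, skipping low-confidence detections.
// minScore = 0.5 is an assumed threshold, not a value from the project.
function groupByZone(predictions, frameWidth, minScore = 0.5) {
  const zones = { left: [], center: [], right: [] };
  for (const p of predictions) {
    if (p.score < minScore) continue;
    zones[classifyZone(p.bbox, frameWidth)].push(p.class);
  }
  return zones;
}

// Made-up detections for a 640-pixel-wide frame.
const sample = [
  { bbox: [20, 100, 120, 300], class: "chair", score: 0.91 },
  { bbox: [260, 80, 140, 380], class: "person", score: 0.88 },
  { bbox: [500, 60, 100, 200], class: "door", score: 0.42 },
];
console.log(groupByZone(sample, 640));
```

Using the box center rather than its edges keeps a wide object from landing in two zones at once; a real build might treat wide objects that span the center strip as blocking the path.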


🌟 Key Benefits

👁️ Vision → Voice: Narrates your environment in real time.
🦯 Safe Movement: Guides you away from dead ends and toward clear paths.
🧠 Conversational Insight: Natural dialogue, not robotic alerts.
🀱 Touch Expansion: Future-ready for haptic feedback integration.
🌍 Accessibility First: Voice-first, minimalistic design built for independence.


🚀 Use Cases

  • Pedestrian navigation for the visually impaired
  • Campus or indoor mobility for students
  • Elderly users navigating homes and care facilities
  • Assistive tech developers integrating multimodal AI

🛠️ How We Built It

🔥 Frontend

  • HTML, CSS, JavaScript for a voice-first UI
  • Web Speech API for push-to-talk and voice capture
  • Mock interface simulating object detection + Gemini dialogue
  • Auth0 integration for security
  • MATLAB plots visualizing the walking distances of friends in friendly competitions

⚙️ Backend

  • Node.js + Express for API routing
  • COCO-SSD model for live object detection
  • ElevenLabs API for lifelike voice output

🤖 AI & APIs

  • ElevenLabs TTS → natural speech output
  • COCO-SSD → real-time detection across 80 object classes
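
As a rough illustration of the voice step, the sketch below builds (but does not send) a text-to-speech request. The endpoint path and the `xi-api-key` header follow ElevenLabs' public REST API as we understand it, so check them against the current documentation. `buildTtsRequest`, the placeholder voice ID, and the placeholder key are hypothetical.

```javascript
// Hypothetical helper: assemble a text-to-speech request without sending it.
// Endpoint and header names are assumptions based on ElevenLabs' public API docs.
function buildTtsRequest(text, voiceId, apiKey) {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    options: {
      method: "POST",
      headers: {
        "xi-api-key": apiKey,
        "Content-Type": "application/json",
        Accept: "audio/mpeg",
      },
      body: JSON.stringify({ text }),
    },
  };
}

// Placeholder credentials; a real call needs a valid voice ID and API key.
const req = buildTtsRequest("Chair ahead.", "YOUR_VOICE_ID", "YOUR_API_KEY");
console.log(req.url);

// Sending it (Node 18+ or a browser, where fetch is global) would look like:
//   const res = await fetch(req.url, req.options);
//   const audio = await res.arrayBuffer();
```

Keeping request construction separate from the network call makes it easy to test, and lets the API key stay on the backend rather than in the browser.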

🛇 Challenges We Overcame

🧩 Integrating three AI systems (vision + language + voice)
🎀 Managing latency in voice-triggered queries
🦯 Translating object positions into spatial guidance
🎧 Designing a calm and empathetic voice UX
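
One way to keep a once-per-second check from turning into constant chatter is to speak only when the message changes or a cooldown has passed. The sketch below shows that idea; `createAnnouncer` and the 5-second cooldown are assumptions for illustration, not the project's actual logic.

```javascript
// Hypothetical gate: decide whether this tick's message should be spoken.
// Speaks when the message differs from the last one spoken, or when
// cooldownMs has elapsed since the last spoken message.
function createAnnouncer(cooldownMs = 5000) {
  let lastMessage = null;
  let lastSpokenAt = -Infinity;
  return function shouldSpeak(message, nowMs) {
    const changed = message !== lastMessage;
    const cooledDown = nowMs - lastSpokenAt >= cooldownMs;
    if (!changed && !cooledDown) return false;
    lastMessage = message;
    lastSpokenAt = nowMs;
    return true;
  };
}

const shouldSpeak = createAnnouncer(5000);
console.log(shouldSpeak("Chair ahead.", 0));          // true: first message
console.log(shouldSpeak("Chair ahead.", 1000));       // false: same message, too soon
console.log(shouldSpeak("Door on your left.", 2000)); // true: scene changed
console.log(shouldSpeak("Door on your left.", 7000)); // true: cooldown elapsed
```

Passing the current time in as an argument, instead of calling `Date.now()` inside, keeps the gate deterministic and easy to test.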


🌺 Accomplishments

✅ Auth0 login capabilities to ensure user data remains secure
✅ End-to-end multimodal pipeline: Detection → Scene Summary → Speech Output
✅ Scene-aware conversational responses
✅ Periodic voice prompts (every second)
✅ Inclusive voice-first interface tested with real users
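
The scene-summary stage of the pipeline could be as simple as turning zoned labels into one short sentence that is then handed to the speech step. This is a sketch under assumptions: `summarizeScene`, the input shape (`left`, `center`, `right` arrays of labels), and the wording are all hypothetical.

```javascript
// Hypothetical scene-summary step: zoned labels in, one spoken sentence out.
function summarizeScene(zones) {
  const parts = [];
  for (const zone of ["left", "center", "right"]) {
    const labels = zones[zone] || [];
    if (labels.length === 0) continue;
    const where = zone === "center" ? "ahead" : `on your ${zone}`;
    parts.push(`${labels.join(" and ")} ${where}`);
  }
  if (parts.length === 0) return "The path looks clear.";
  // Anything in the center zone is treated as a possible obstacle.
  const advice = (zones.center || []).length > 0
    ? "Slow down."
    : "The way ahead looks clear.";
  return `Nearby: ${parts.join(", ")}. ${advice}`;
}

console.log(summarizeScene({ left: ["chair"], center: [], right: ["person"] }));
// Nearby: chair on your left, person on your right. The way ahead looks clear.
```

Ending on an action ("Slow down." or "The way ahead looks clear.") reflects the idea that users need guidance rather than a raw list of objects.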


📚 What We Learned

  • Context > Detection: Users need actionable guidance, not raw data
  • Voice-first Design improves trust and usability
  • Multimodal AI bridges the gap between accessibility and autonomy
  • Accessibility = intuitive speech, minimal friction, and reliability

🚀 Next Steps

🦑 Integrate haptic belt feedback for spatial direction
🗺️ Add indoor navigation using AR markers
📱 Launch a mobile app (Flutter) with offline support
🧠 Integrate Gemini context memory for multi-turn conversations
🪧 Incorporate Optical Character Recognition (OCR) to read street signs, menus, labels, bus numbers, and product packaging
🕶️ Integrate the app with wearable interfaces like Meta Glasses for ease of use


❀️ Why Spidey Sense

Spidey Sense empowers visually impaired users to move confidently and independently by combining sight, speech, and spatial intelligence into one assistive companion. It's more than an app: it's AI that helps you feel your surroundings.
