Spidey Sense

🕷️ Spidey Sense

Navigate the world, fearlessly.

AI-powered navigation assistant that helps visually impaired users explore their surroundings through real-time detection, spatial awareness, and conversational guidance.

🌎 Social Impact

More than 285 million people worldwide live with visual impairments. Traditional canes and GPS apps offer limited awareness — they can’t describe nearby obstacles or open paths.

Spidey Sense transforms independence by turning vision into voice, guiding users with real-time spatial awareness, conversational AI, and natural speech — empowering safer, more confident mobility.

🧠 Inspiration

We asked: “What if someone who can’t see could still sense the world like Spider-Man?”

Most navigation tools tell you where you are, not what’s around you. Visually impaired users often wonder:

“Is there something in front of me?”
“Can I move forward safely?”

So we built Spidey Sense — a friendly AI companion that sees, thinks, and speaks, helping users explore with awareness and trust.

💡 What It Does

🧠 Real-Time Object Detection: Detects people, chairs, doors, and obstacles using COCO-SSD.
🦯 Spatial Awareness Engine: Classifies objects into left, center, and right zones for precise guidance.
🔊 Voice Synthesis (ElevenLabs): Converts COCO-SSD’s findings into lifelike speech.
🥐 Smart Timer: Checks the surroundings every second and guides the user accordingly.
🪄 Multi-Mode Awareness: Switch between Explore, Focus, and Follow for different contexts.
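
To make the left / center / right idea concrete, here is a minimal sketch of how a detection could be mapped to a zone. It assumes detections arrive in the COCO-SSD shape (`bbox` as `[x, y, width, height]` in pixels, plus `class` and `score`). The equal-thirds split, the `0.5` score threshold, and the helper names `classifyZone` and `annotateDetections` are illustrative assumptions, not necessarily what our build uses.

```javascript
// Hypothetical helper: map a detection's horizontal centre to a zone.
// bbox follows the COCO-SSD convention [x, y, width, height] in pixels.
function classifyZone(bbox, frameWidth) {
  const [x, , width] = bbox;
  const centerX = x + width / 2;
  if (centerX < frameWidth / 3) return "left";
  if (centerX < (2 * frameWidth) / 3) return "center";
  return "right";
}

// Attach a zone to each detection and drop low-confidence results.
// minScore = 0.5 is an assumed threshold, not a value from the project.
function annotateDetections(detections, frameWidth, minScore = 0.5) {
  return detections
    .filter((d) => d.score >= minScore)
    .map((d) => ({ label: d.class, zone: classifyZone(d.bbox, frameWidth) }));
}

// Sample input for a 640-pixel-wide frame (made-up values for illustration).
const sample = [
  { bbox: [20, 80, 100, 200], class: "chair", score: 0.91 },
  { bbox: [270, 60, 120, 300], class: "person", score: 0.88 },
  { bbox: [560, 90, 60, 90], class: "bench", score: 0.32 },
];
console.log(annotateDetections(sample, 640));
```

With this sample, the chair lands in the left zone, the person in the center, and the low-confidence bench is dropped before it can trigger a spoken alert.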

🌟 Key Benefits

👁️ Vision → Voice — Narrates your environment in real time.
🦯 Safe Movement — Guides you away from dead ends and toward clear paths.
🧠 Conversational Insight — Natural dialogue, not robotic alerts.
🤱 Touch Expansion — Future-ready for haptic feedback integration.
🌍 Accessibility First — Voice-first, minimalistic design built for independence.

🚀 Use Cases

Pedestrian navigation for the visually impaired
Campus or indoor mobility for students
Elderly users navigating homes and care facilities
Assistive tech developers integrating multimodal AI

🛠️ How We Built It

🔥 Frontend

HTML, CSS, JavaScript for a voice-first UI
Web Speech API for push-to-talk and voice capture
Mock interface simulating object detection + Gemini dialogue
Auth0 integration for security
MATLAB plots visualizing the walking distances of friends in friendly competitions

⚙️ Backend

Node.js + Express for API routing
COCO-SSD model for live object detection
ElevenLabs API for lifelike voice output

🤖 AI & APIs

ElevenLabs TTS → natural speech output
COCO-SSD → real-time detection across 80 object classes
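
For the speech step, a request to the ElevenLabs text-to-speech REST endpoint could be assembled roughly as below. This is a sketch under assumptions: `buildSpeechRequest` is a hypothetical helper, the voice ID and API key are placeholders, and the default `model_id` is only an example value. Check the current ElevenLabs API reference before relying on any of it.

```javascript
// Build (but do not send) a request for the ElevenLabs text-to-speech endpoint.
// voiceId, apiKey, and the default modelId are placeholders for illustration.
function buildSpeechRequest(text, voiceId, apiKey, modelId = "eleven_multilingual_v2") {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    options: {
      method: "POST",
      headers: {
        "xi-api-key": apiKey,
        "Content-Type": "application/json",
        Accept: "audio/mpeg",
      },
      body: JSON.stringify({ text, model_id: modelId }),
    },
  };
}

// Possible usage with fetch (Node 18+ or a browser), left commented out
// so the sketch runs without network access or credentials:
// const { url, options } = buildSpeechRequest("Chair on your left.", VOICE_ID, API_KEY);
// const audioBytes = await (await fetch(url, options)).arrayBuffer();
```

Keeping request construction separate from the network call makes it easy to unit-test the payload and to keep the API key on the server side.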

🛇 Challenges We Overcame

🧩 Integrating three AI systems (vision + language + voice)
🎤 Managing latency in voice-triggered queries
🦯 Translating object positions into spatial guidance
🎧 Designing a calm and empathetic voice UX
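
One way to keep a once-per-second check from turning into a once-per-second announcement is to gate what actually gets spoken. The sketch below is an assumption about how that could work, not a description of our exact code: `createAnnouncementGate` and the 5-second repeat interval are hypothetical. A summary is spoken only if it differs from the last one, or if enough time has passed that a reminder is useful.

```javascript
// Hypothetical gate that decides whether a new scene summary should be spoken.
// repeatAfterMs is an assumed interval for repeating an unchanged summary.
function createAnnouncementGate(repeatAfterMs = 5000) {
  let lastText = null;
  let lastSpokenAt = -Infinity;
  return function shouldAnnounce(text, nowMs) {
    const changed = text !== lastText;
    const stale = nowMs - lastSpokenAt >= repeatAfterMs;
    if (!changed && !stale) return false; // same scene, spoken recently: stay quiet
    lastText = text;
    lastSpokenAt = nowMs;
    return true;
  };
}

const gate = createAnnouncementGate(5000);
console.log(gate("Chair on your left.", 0));    // true  (first announcement)
console.log(gate("Chair on your left.", 1000)); // false (unchanged, too soon)
console.log(gate("Person ahead.", 2000));       // true  (scene changed)
console.log(gate("Person ahead.", 7000));       // true  (unchanged, but 5 s have passed)
```

Passing the clock value in as `nowMs` instead of reading `Date.now()` inside the gate keeps the logic deterministic and easy to test.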

🌺 Accomplishments

✅ Auth0 login to help keep user data secure
✅ End-to-end multimodal pipeline: Detection → Scene Summary → Speech Output
✅ Scene-aware conversational responses
✅ Periodic voice prompts (every second)
✅ Inclusive voice-first interface tested with real users
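
The Scene Summary stage of the pipeline can be pictured as a small function that turns zoned detections into one sentence ready for text-to-speech. The sketch below assumes inputs shaped like `{ label, zone }`. The phrasing, the center-first ordering, and the name `summarizeScene` are illustrative choices rather than our exact wording.

```javascript
// Hypothetical summariser for items shaped like { label, zone },
// where zone is "left", "center", or "right".
const ZONE_PHRASES = { left: "on your left", center: "ahead", right: "on your right" };

function summarizeScene(items) {
  const parts = [];
  // Report what is directly ahead first, since it matters most for movement.
  for (const zone of ["center", "left", "right"]) {
    const labels = [...new Set(items.filter((i) => i.zone === zone).map((i) => i.label))];
    if (labels.length > 0) parts.push(`${labels.join(" and ")} ${ZONE_PHRASES[zone]}`);
  }
  if (parts.length === 0) return "No objects detected.";
  const sentence = parts.join(", ");
  return sentence.charAt(0).toUpperCase() + sentence.slice(1) + ".";
}

console.log(summarizeScene([
  { label: "chair", zone: "left" },
  { label: "person", zone: "center" },
]));
// prints "Person ahead, chair on your left."
```

Note that an empty result is reported as "No objects detected." rather than "the path is clear", because a detector missing an object is not the same as the path being safe.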

📚 What We Learned

Context > Detection: Users need actionable guidance, not raw data
Voice-first Design improves trust and usability
Multimodal AI bridges the gap between accessibility and autonomy
Accessibility = intuitive speech, minimal friction, and reliability

🚀 Next Steps

🦡 Integrate haptic belt feedback for spatial direction
🗺️ Add indoor navigation using AR markers
📱 Launch a mobile app (Flutter) with offline support
🧠 Integrate Gemini context memory for multi-turn conversations
🪧 Add optical character recognition (OCR) to read street signs, menus, labels, bus numbers, and product packaging
🕶️ Integrate the app with wearables such as Meta smart glasses for ease of use

❤️ Why Spidey Sense

Spidey Sense empowers visually impaired users to move confidently and independently — combining sight, speech, and spatial intelligence into one assistive companion. It’s more than an app — it’s AI that helps you feel your surroundings.

Built With

coco-ssd
css3
elevenlabapi
elevenlabstts
figma
git
javascript
matlab
node.js
react

Submitted to

HackUTA 7: Enter the Hacker-Verse

Created by

I created the back end. I implemented the computer vision portion using COCO-SSD and produced the telemetry results by modeling the physics of the phone’s location as a time series.

Anthony Abubakar
noriacha
Mohtashim Syed
Abubakar Kassim