Intelux
Inspiration
Over 1 billion people worldwide live with visual impairment, yet navigating everyday environments remains a largely unsolved challenge. Existing aids like white canes and guide dogs are helpful but limited: they can tell you that something is there, not what it is. We wanted to go further. Intelux was born from the desire to give visually impaired individuals a richer, more independent experience of the world around them, one where they can not only sense their surroundings, but understand and interact with them.
What it does
Intelux is an AI-powered wearable device — currently a camera mounted on a hat, with a smart glasses form factor as the end goal — that delivers real-time auditory descriptions of the user's environment.
It operates in three modes:
Environment Mode continuously runs the live camera feed through YOLO, detecting objects and narrating what is around the user (left, right, ahead) at short intervals. This gives users constant, real-time awareness of nearby objects as they move through a space: for example, knowing that a chair is coming up on the left, that a door is straight ahead, or how much money they are holding.
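The narration logic can be sketched as follows. This is an illustrative simplification, not our exact code: `position_of` and `narrate` are hypothetical helper names, and the real loop reads its inputs from YOLO's detection results for each frame.

```python
# Sketch of Environment Mode's narration step (illustrative names).
# YOLO yields labeled bounding boxes; we map each box's horizontal
# centre to "left", "ahead", or "right" relative to the frame width.

def position_of(x_center: float, frame_width: float) -> str:
    """Classify a detection's horizontal position in the frame."""
    ratio = x_center / frame_width
    if ratio < 1 / 3:
        return "left"
    if ratio > 2 / 3:
        return "right"
    return "ahead"

def narrate(detections: list[tuple[str, float]], frame_width: float) -> str:
    """Turn (label, x_center) detections into one spoken sentence."""
    phrases = [f"{label} {position_of(x, frame_width)}" for label, x in detections]
    return ", ".join(phrases) if phrases else "nothing detected"
```

For a 640-pixel-wide frame, `narrate([("chair", 100), ("door", 320)], 640)` produces "chair left, door ahead", which is then handed to the text-to-speech step.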
Interactive Mode allows users to ask specific questions about their surroundings, such as the size of nearby objects, what text is on a sign, or what color something is. The device captures a snapshot of the environment and passes it, along with the user's spoken question, to Claude (Anthropic's API), which returns a precise, context-aware answer played back through the speaker.
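A minimal sketch of how the snapshot and question are packaged for Claude, using Anthropic's Messages API image-content format. The model id, system prompt, and `build_claude_request` helper are illustrative, not our production values.

```python
import base64

def build_claude_request(image_path: str, question: str) -> dict:
    """Build kwargs for Anthropic's Messages API: one user turn with
    the environment snapshot (base64 JPEG) plus the spoken question."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": "claude-3-5-sonnet-20241022",  # illustrative model id
        "max_tokens": 300,
        "system": "You are assisting a blind user. Answer briefly and concretely.",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": "image/jpeg",
                            "data": image_b64}},
                {"type": "text", "text": question},
            ],
        }],
    }

# The actual call (requires ANTHROPIC_API_KEY in the environment):
# import anthropic
# reply = anthropic.Anthropic().messages.create(
#     **build_claude_request("snapshot.jpg", "What color is the cup?"))
# answer = reply.content[0].text  # then sent to text-to-speech
```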
Navigation Mode gives spoken directions to a nearby destination the user asks for, with routing powered by the Open Source Routing Machine (OSRM).
All three modes use voice input and audio output, keeping the experience completely hands-free.
How we built it
- Hardware: Raspberry Pi with a camera module worn on a hat, with a speaker for audio output
- Video capture: OpenCV for frame-by-frame video processing
- Object detection: YOLOv8 (Ultralytics) for real-time environment detection in Environment Mode
- Language model: Claude via the Anthropic API for intelligent, context-aware responses in Interactive Mode. The model receives an image snapshot and the user's question alongside a carefully crafted system prompt
- Text-to-speech: ElevenLabs to convert all text output into natural-sounding audio
- Language: Python for all scripting and workflow orchestration
- Debugging: Claude Code assisted us in resolving audio and video capture pipeline issues
We began by building and validating the entire pipeline on our laptops, then ported and tested everything on the Raspberry Pi once stable.
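The text-to-speech step, which proved the trickiest part of the pipeline, can be sketched against ElevenLabs' REST text-to-speech endpoint. The voice id, model id, and `build_tts_request` helper below are placeholders, not our real configuration.

```python
# Sketch of the ElevenLabs text-to-speech call (placeholder ids).

def build_tts_request(text: str, voice_id: str) -> tuple[str, dict]:
    """Return (url, json_body) for ElevenLabs' text-to-speech endpoint."""
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    body = {"text": text, "model_id": "eleven_multilingual_v2"}
    return url, body

# Actual call (requires an API key in the xi-api-key header):
# import requests
# url, body = build_tts_request("Chair on your left.", "YOUR_VOICE_ID")
# resp = requests.post(url, json=body, headers={"xi-api-key": API_KEY})
# with open("out.mp3", "wb") as f:
#     f.write(resp.content)  # then played through the Pi's speaker
```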
Challenges we ran into
The most time-consuming challenge was getting text-to-speech to work reliably after receiving output from YOLO or Claude. The audio pipeline kept failing silently, and it took significant effort to trace and resolve.
We also hit a dependency conflict when trying to install the Ultralytics library on our MacBooks due to a Python version mismatch, which added unexpected friction early in development.
Transitioning from laptop to Raspberry Pi introduced additional environment-level issues.
Accomplishments that we're proud of
We built a fully working prototype in time for the demo. A user can speak to the device and get real, useful responses in all three modes:
- In Environment Mode, saying something like "describe my environment" triggers continuous narration of nearby objects and their positions relative to the user
- In Interactive Mode, users can ask follow-up questions and receive precise, image-grounded answers from Claude
- In Navigation Mode, users can ask for directions to a nearby place, powered by Open Source Routing Machine (OSRM)
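The Navigation Mode query can be sketched as an OSRM route request. The coordinates, the `osrm_route_url` helper, and the use of OSRM's public demo server are illustrative; a production deployment would self-host OSRM, and the routing profile actually used depends on the data the server has loaded.

```python
# Sketch of a Navigation Mode routing query against OSRM's demo server.

def osrm_route_url(start: tuple[float, float], end: tuple[float, float]) -> str:
    """Build an OSRM /route query; OSRM expects lon,lat coordinate order."""
    (lat1, lon1), (lat2, lon2) = start, end
    return (
        "https://router.project-osrm.org/route/v1/foot/"
        f"{lon1},{lat1};{lon2},{lat2}?steps=true"
    )

# Actual call:
# import requests
# route = requests.get(osrm_route_url((43.47, -80.54), (43.48, -80.53))).json()
# steps = route["routes"][0]["legs"][0]["steps"]  # maneuvers to narrate aloud
```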
Beyond the technical achievement, we're proud of how well we functioned as a team. We divided responsibilities clearly, worked in short sprints with defined goals, and executed efficiently under time pressure. Most of all, it's meaningful to us that what we built could genuinely improve someone's daily life.
What we learned
Technically, we gained hands-on experience with Raspberry Pi, OpenCV, YOLOv8, ElevenLabs, and multimodal LLM APIs, many of which were new to us. We learned that working with embedded hardware requires patience and a different debugging mindset than pure software development.
On the team side, we learned that clear task ownership and sprint-based goals make a real difference. Breaking the project into focused milestones helped us make consistent progress and avoid getting stuck.
What's next for Intelux
- Smart Glasses Integration: Moving from a hat-mounted camera to a sleek, purpose-built pair of smart glasses for a more natural and discreet form factor
- Expanded capabilities: OCR for reading text in the environment, facial recognition for identifying known people, and improved object tracking across frames
- Market launch: Our goal is to refine the product and bring it to market within 12 months, making a tangible impact on the lives of visually impaired individuals worldwide