Inspiration

Our journey began with a simple observation: the world is designed for the sighted, leaving the visually impaired to navigate a labyrinth of obstacles every day. We were inspired by the resilience of the visually impaired community and motivated by a single question: How can we use generative AI to turn "seeing" into "hearing"? We wanted to create more than just a tool; we wanted to build a seamless bridge to independence.

What it does

ReNex acts as an intelligent set of eyes. Using smart glasses or a mobile interface, it captures the user’s surroundings and provides real-time audio feedback. It doesn't just name objects; it describes scenes, reads text (OCR) in multiple languages, identifies currency, and recognizes faces. Beyond recognition, it integrates with navigation services to provide hands-free, voice-controlled guidance and raises safety alerts when obstacles are close.

How we built it

We developed a hybrid architecture to balance power and portability:

The Brain: We integrated Gemini 2.0 Flash for its native multimodality, which lets the system reason over visual data and the user's query in the same request.

The Backend: A Python/Flask server handles the heavy lifting of API orchestration and data processing.

The Edge: We run time-sensitive AI computations on a Raspberry Pi to keep latency low, which is critical for real-time safety.

The Feedback Loop: Voice commands are transcribed with Google Speech-to-Text, spoken feedback is generated with gTTS, and tactile vibrations provide a secondary safety layer.
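To make the feedback loop concrete, here is a minimal sketch of how a scene description and an obstacle distance could be combined into one spoken message plus a vibration strength. The names `Feedback` and `build_feedback` and both distance thresholds are assumptions made for illustration; they are not taken from the ReNex code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds in metres; the real values used by ReNex are not stated.
ALERT_DISTANCE_M = 1.5   # start warning below this distance
DANGER_DISTANCE_M = 0.5  # full-strength warning at or below this distance


@dataclass
class Feedback:
    speech: str       # text that would be handed to a TTS engine such as gTTS
    vibration: float  # haptic strength from 0.0 (off) to 1.0 (maximum)


def build_feedback(scene_description: str,
                   obstacle_distance_m: Optional[float]) -> Feedback:
    """Combine the audio and tactile channels into one response."""
    # No obstacle detected, or it is far away: speak the description only.
    if obstacle_distance_m is None or obstacle_distance_m > ALERT_DISTANCE_M:
        return Feedback(scene_description, 0.0)

    # Very close obstacle: a short, urgent message and full vibration.
    if obstacle_distance_m <= DANGER_DISTANCE_M:
        return Feedback("Stop. Obstacle very close.", 1.0)

    # In between: scale vibration linearly as the obstacle gets closer.
    span = ALERT_DISTANCE_M - DANGER_DISTANCE_M
    strength = (ALERT_DISTANCE_M - obstacle_distance_m) / span
    speech = (f"Caution, obstacle {obstacle_distance_m:.1f} metres ahead. "
              f"{scene_description}")
    return Feedback(speech, strength)
```

In this sketch the safety message is placed before the scene description, so the most urgent information is spoken first.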

Challenges we ran into

One of our biggest hurdles was latency. In a navigation scenario, a 2-second delay in detecting an obstacle can be the difference between safety and an accident. We addressed this by moving time-sensitive processing to the Raspberry Pi (edge computing) and choosing the fast Gemini 2.0 Flash model. Another challenge was keeping the system working in areas with poor internet; we addressed this by developing a "model switching" feature that can fall back to offline, custom-trained models when needed.
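The "model switching" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the ReNex implementation: `ModelSwitcher`, the `cloud` and `offline` callables, the 2-second budget, and the 30-second cooldown are all hypothetical. The cloud callable stands in for a Gemini request and the offline callable for a locally trained model.

```python
import time
from typing import Callable, Tuple

Describer = Callable[[bytes], str]  # takes image bytes, returns a description


class ModelSwitcher:
    """Prefer the cloud model; fall back to the offline one when it fails or is slow."""

    def __init__(self, cloud: Describer, offline: Describer,
                 budget_s: float = 2.0, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cloud = cloud
        self.offline = offline
        self.budget_s = budget_s      # hypothetical latency budget per request
        self.cooldown_s = cooldown_s  # how long to stay offline after a problem
        self.clock = clock
        self._offline_until = float("-inf")

    def describe(self, image: bytes) -> Tuple[str, str]:
        """Return (source, description), where source is 'cloud' or 'offline'."""
        start = self.clock()
        # A recent failure or slow response: skip the network entirely.
        if start < self._offline_until:
            return "offline", self.offline(image)
        try:
            text = self.cloud(image)
        except OSError:
            # ConnectionError and TimeoutError are both subclasses of OSError.
            self._offline_until = self.clock() + self.cooldown_s
            return "offline", self.offline(image)
        if self.clock() - start > self.budget_s:
            # The answer arrived, but too slowly; use the offline model for a while.
            self._offline_until = self.clock() + self.cooldown_s
        return "cloud", text
```

The cooldown keeps the device from retrying a dead connection on every frame, which would add the network timeout to each obstacle check.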

Accomplishments that we're proud of

Successful Integration: Seamlessly combining LLMs (Gemini) with traditional computer vision (TensorFlow/PyTorch) for a holistic understanding of the environment.

Offline Capability: Creating a system that remains affordable and functional without constant cloud reliance.

Impact-Driven Design: Developing a solution that received positive feedback for its "No complex setup" approach, making it wearable and user-friendly for all ages.

What we learned

Building ReNex taught us that AI is most powerful when it is accessible. We learned the intricacies of RESTful API development and the importance of multimodal feedback (audio + tactile) in UX design for the visually impaired. Most importantly, we learned how to optimize large models for real-world, low-power hardware constraints.

What's next for Z3GION

The next phase for our team involves expanding our "spatial awareness" models to detect more complex environmental hazards. We plan to incorporate haptic maps on wearable bands and extend our multilingual support to more local dialects. Our goal is to scale ReNex from a prototype into a globally accessible platform for all types of visual assistance.
