🌟 Inspiration

Visual impairment affects 2.2 billion people around the world, often limiting mobility and access to education and harming mental health. We aim to bridge that gap with an intelligent personal assistant capable of visual recognition, object memory, and contextual understanding.

Our idea was an app that could serve as the eyes of visually impaired users: reminding them when and where they last encountered certain items, and alerting them to anything concerning in their surroundings.


🧠 What It Does

Foresight is a personal assistant app designed to help users remember and interact with their surroundings.

By continuously capturing visual data and analyzing it in real time, it allows users to ask questions like:

  • “Where did I leave my keys?”
  • “What did I do last week?”

Foresight leverages the power of AI and computer vision to provide contextual memory that users can rely on in their daily lives.


🛠️ How We Built It

  • Backend: Built with Python, managing data processing, interaction with Gemini, and integration of visual context.
  • AI Integration: Used Gemini to help analyze real-world data, enabling accurate, context-driven responses.
  • Database: Used MongoDB to store chat history, visual data, and user information.
  • API: Implemented with FastAPI for real-time communication between the app and server.
  • Frontend: Developed using Next.js for a dynamic, responsive interface.
  • Text-to-Speech: Integrated ElevenLabs technology to provide a human-like assistant voice.
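
To make the object-memory idea concrete, here is a minimal sketch of how a question like “Where did I leave my keys?” could be answered from stored sightings. It is an illustration only, not the project's actual code: the `Sighting` schema and the `SightingLog` class are hypothetical, and a plain in-memory list stands in for the MongoDB collection and for the labels a vision model such as Gemini might return.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Sighting:
    """One recognized object at a point in time (hypothetical schema)."""
    label: str         # e.g. "keys", as a vision model might report it
    location: str      # e.g. "kitchen counter"
    seen_at: datetime  # when the frame was captured


class SightingLog:
    """In-memory stand-in for a database collection of sightings."""

    def __init__(self) -> None:
        self._sightings: list[Sighting] = []

    def record(self, label: str, location: str, seen_at: datetime) -> None:
        # Normalize the label so lookups are case-insensitive.
        self._sightings.append(Sighting(label.lower(), location, seen_at))

    def last_seen(self, label: str) -> Optional[Sighting]:
        """Return the most recent sighting of `label`, or None if never seen."""
        matches = [s for s in self._sightings if s.label == label.lower()]
        return max(matches, key=lambda s: s.seen_at, default=None)


log = SightingLog()
log.record("keys", "front door shelf", datetime(2024, 5, 1, 8, 30))
log.record("keys", "kitchen counter", datetime(2024, 5, 1, 18, 5))
log.record("wallet", "desk", datetime(2024, 5, 1, 9, 0))

answer = log.last_seen("keys")
print(f"Your keys were last seen at the {answer.location}.")
```

In a deployed version, the same lookup would presumably be a database query sorted by timestamp, with the result passed to the text-to-speech layer rather than printed.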

🚧 Challenges We Ran Into

One of the major challenges was ensuring real-time processing of the live video feed while maintaining app performance.

We had to:

  • Manage a continuous stream of image data.
  • Optimize visual recognition for speed and accuracy.

Through data consolidation and backend optimizations, we were able to meet these challenges.
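
As an illustration of what data consolidation can mean for a continuous image stream, the sketch below merges runs of consecutive frames that contain the same detected objects into a single time-stamped segment, so far fewer records need to be stored or sent for analysis. This is one plausible approach under stated assumptions, not necessarily the one used in Foresight; the `Frame` and `Segment` types and the `consolidate` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    timestamp: float        # seconds since capture started
    labels: frozenset[str]  # objects detected in this frame


@dataclass
class Segment:
    start: float            # timestamp of the first frame in the run
    end: float              # timestamp of the last frame in the run
    labels: frozenset[str]  # objects visible throughout the run


def consolidate(frames: list[Frame]) -> list[Segment]:
    """Merge runs of consecutive frames whose detected labels are identical."""
    segments: list[Segment] = []
    for frame in frames:
        if segments and segments[-1].labels == frame.labels:
            # Same scene as the previous frame: extend the current run.
            segments[-1].end = frame.timestamp
        else:
            # Scene changed (or first frame): start a new run.
            segments.append(Segment(frame.timestamp, frame.timestamp, frame.labels))
    return segments


frames = [
    Frame(0.0, frozenset({"keys", "table"})),
    Frame(0.5, frozenset({"keys", "table"})),
    Frame(1.0, frozenset({"keys", "table"})),
    Frame(1.5, frozenset({"table"})),
    Frame(2.0, frozenset({"table"})),
]
segments = consolidate(frames)
print(len(frames), "frames ->", len(segments), "segments")  # 5 frames -> 2 segments
```

The trade-off is that exact label equality is strict: a detector that flickers between frames would break runs apart, so a real pipeline might tolerate small differences before starting a new segment.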


📚 What We Learned

Building Foresight taught us how to:

  • Seamlessly integrate multiple cutting-edge technologies.
  • Fine-tune AI-driven visual context using Gemini.
  • Efficiently manage real-time data streaming.
  • Balance performance and scalability.

🚀 What's Next for Foresight

We aim to:

  • Increase accessibility with voice-only navigation.
  • Add real-time alerts for danger and time-sensitive information.
  • Enhance visual recognition accuracy for more nuanced environments.
  • Introduce personalized features, like customizable memory settings.
  • Add predictive suggestions based on user behavior.
