Foresight

🌟 Inspiration

Visual impairment affects at least 2.2 billion people around the world, often making mobility and access to education harder and taking a toll on mental health. We strive to narrow that gap by creating an intelligent personal assistant capable of visual recognition, object memory, and contextual understanding.

Our idea was an app that serves as the eyes of visually impaired users, reminding them when and where they last encountered certain items and alerting them to anything concerning in their surroundings.

🧠 What It Does

Foresight is a personal assistant app designed to help users remember and interact with their surroundings.

By continuously capturing visual data and analyzing it in real time, it lets users ask questions like:

“Where did I leave my keys?”
“What did I do last week?”

Foresight leverages the power of AI and computer vision to provide contextual memory that users can rely on in their daily lives.

🛠️ How We Built It

Backend: Built with Python, managing data processing, interaction with Gemini, and integration of visual context.
AI Integration: Implemented Gemini to help analyze real-world data, enabling accurate, context-driven responses.
Database: Used MongoDB to store chat history, visual data, and user information.
API: Implemented with FastAPI for real-time communication between the app and server.
Frontend: Developed using Next.js for a dynamic, responsive interface.
Text-to-Speech: Integrated ElevenLabs technology to provide a human-like assistant voice.
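
To make the "contextual memory" idea concrete, here is a minimal sketch of how a question like "Where did I leave my keys?" could be answered from stored observations. It is not the project's actual code: the `Observation` fields, the `last_seen` helper, and the sample data are all hypothetical, and a plain Python list stands in for the MongoDB collection. The real app would also rely on Gemini rather than substring matching to interpret the question.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Observation:
    """One captioned frame (hypothetical schema, standing in for a MongoDB document)."""
    seen_at: datetime
    location: str
    caption: str


def last_seen(memory: list[Observation], item: str) -> Optional[Observation]:
    """Return the most recent observation whose caption mentions `item`, or None."""
    matches = [obs for obs in memory if item.lower() in obs.caption.lower()]
    return max(matches, key=lambda obs: obs.seen_at, default=None)


# Made-up sample data for illustration only.
memory = [
    Observation(datetime(2025, 4, 5, 8, 30), "kitchen", "keys on the counter"),
    Observation(datetime(2025, 4, 5, 9, 15), "hallway", "keys on the shelf by the door"),
    Observation(datetime(2025, 4, 5, 9, 40), "desk", "laptop and coffee mug"),
]

hit = last_seen(memory, "keys")
if hit is not None:
    print(f"Last saw keys at {hit.seen_at:%H:%M} in the {hit.location}: {hit.caption}")
```

In a full pipeline, the text of the matching observation would then be passed to the text-to-speech step so the answer is spoken aloud.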

🚧 Challenges We Ran Into

One of the major challenges was ensuring real-time processing of the live video feed while maintaining app performance.

We had to:

Manage a continuous stream of image data.
Optimize visual recognition for speed and accuracy.

Through data consolidation and backend optimizations, we were able to meet these challenges.
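
As an illustration of what data consolidation can look like for a continuous image stream, the sketch below merges runs of identical consecutive frame captions into a single record. This is an assumption about one possible approach, not a description of what Foresight does; the `Sighting` type and `consolidate` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Sighting:
    """A run of consecutive frames that produced the same caption (hypothetical type)."""
    caption: str
    first_frame: int
    last_frame: int


def consolidate(captions: list[str]) -> list[Sighting]:
    """Merge runs of identical consecutive captions into one record each."""
    merged: list[Sighting] = []
    for frame_no, caption in enumerate(captions):
        if merged and merged[-1].caption == caption:
            merged[-1].last_frame = frame_no  # same scene: extend the current run
        else:
            merged.append(Sighting(caption, frame_no, frame_no))  # scene changed
    return merged


# Made-up captions, one per captured frame.
stream = ["keys on table", "keys on table", "keys on table", "empty table", "empty table"]
for sighting in consolidate(stream):
    print(sighting)
```

Five frames collapse into two records here, so the store and any downstream model calls scale with scene changes rather than with the raw frame rate.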

📚 What We Learned

Building Foresight taught us how to:

Seamlessly integrate multiple cutting-edge technologies.
Fine-tune AI-driven visual context using Gemini.
Efficiently manage real-time data streaming.
Balance performance and scalability.
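
One way to keep a growing memory store manageable is to prune old records, a capability the team mentions building. The sketch below shows the general idea under stated assumptions: the `prune` function, the `seen_at` field, and the seven-day retention window are hypothetical, and a list of dictionaries stands in for database documents.

```python
from datetime import datetime, timedelta


def prune(records: list[dict], now: datetime, max_age: timedelta) -> list[dict]:
    """Keep only records whose 'seen_at' timestamp is within max_age of now."""
    cutoff = now - max_age
    return [r for r in records if r["seen_at"] >= cutoff]


# Made-up sample data; the retention window is an arbitrary choice.
now = datetime(2025, 4, 6, 12, 0)
records = [
    {"caption": "keys on the counter", "seen_at": now - timedelta(days=10)},
    {"caption": "wallet on the desk", "seen_at": now - timedelta(days=2)},
]
kept = prune(records, now, max_age=timedelta(days=7))
print([r["caption"] for r in kept])
```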

🚀 What's Next for Foresight

We aim to:

Increase accessibility with voice-only navigation.
Add real-time alerts for danger and time-sensitive information.
Enhance visual recognition accuracy for more nuanced environments.
Introduce personalized features, like customizable memory settings.
Add predictive suggestions based on user behavior.

Built With

atlas
elevenlabs
fastapi
gemini
mongodb
next.js
onrender
python
spacy

Submitted to

SFHacks 2025

Created by

This was my first hackathon project. I worked on a large portion of the backend and handled the core integration between Python, FastAPI, and Gemini.

Anson Tan
Full Stack Dev @ UCD
This is my first hackathon. I worked on the frontend, data handling, and backend communications.

Andrey Neudachin
I'm a second-time hacker who worked on some of the backend, our TTS, a spaCy model for word vector comparisons, and the database pruning capability.

Dmitriy Gamolya