Inspiration
With so much happening around us every day, it can be hard to remember everything we see or do. We wanted to build something that could extend human memory and understanding. This idea inspired ARVS (Augmented Recall and Vision System), smart glasses that can:
- Understand what you see
- Remember your experiences
- Help you recall or analyze moments later
What it does
ARVS acts like a visual assistant that sees and remembers your environment. It uses cameras and AI to:
- Recognize objects
- Read text
- Understand context in real time
You can ask questions like:
- “Where did I leave my wallet?”
- “Can you solve that math problem I saw earlier?”
ARVS will recall the right moment from your past recordings and provide accurate, context-aware answers.
How we built it
We built ARVS using the following technologies and approaches:
- Data storage: video data stored in MongoDB
- Processing: computer vision and OCR models extract objects and text from frames
- Vectorization: each frame is converted into an embedding for fast semantic search
- AI reasoning: Google Gemini for natural language understanding and contextual reasoning
- Voice feedback: ElevenLabs for realistic speech responses
- Frontend: interactive interface for voice and text queries
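The vectorization step above can be sketched roughly as follows. The project lists FAISS in its stack, but to keep this example self-contained it uses plain NumPy cosine similarity instead; `embed_frame`, `FrameIndex`, and the 512-dimensional embedding size are illustrative assumptions, with a hash-seeded random projection standing in for a real vision embedding model.

```python
import numpy as np

DIM = 512  # assumed embedding size; a real vision model's dimension may differ

def embed_frame(frame_pixels: np.ndarray) -> np.ndarray:
    """Stand-in for a real vision embedding model: map pixels to a fixed-size
    vector and L2-normalize it, so a dot product equals cosine similarity."""
    rng = np.random.default_rng(int(frame_pixels.sum()) % (2**32))
    vec = rng.standard_normal(DIM)
    return vec / np.linalg.norm(vec)

class FrameIndex:
    """Maps frame timestamps to embeddings for semantic search
    (a minimal in-memory stand-in for a FAISS index)."""

    def __init__(self):
        self.vectors = []     # unit-norm embedding per stored frame
        self.timestamps = []  # when each frame was recorded

    def add(self, timestamp: float, frame_pixels: np.ndarray) -> None:
        self.vectors.append(embed_frame(frame_pixels))
        self.timestamps.append(timestamp)

    def search(self, query_vec: np.ndarray, k: int = 3):
        """Return the k most similar frames as (timestamp, score) pairs."""
        sims = np.stack(self.vectors) @ query_vec
        top = np.argsort(sims)[::-1][:k]
        return [(self.timestamps[i], float(sims[i])) for i in top]
```

At query time, the user's question would be embedded the same way, and the search would return the timestamps of the best-matching recorded moments for the reasoning model to examine.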
Challenges we ran into
Key challenges included:
- Reducing AI calls while maintaining accuracy
- Efficient storage and retrieval of large video data
- Maintaining low latency for real-time responses
- Ensuring meaningful recall for context-specific queries
We solved these by replacing direct prompt-based AI calls with search over vectorized embeddings, so finding the relevant moment happens locally and only the retrieved frames are sent to the model for reasoning.
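The cost difference behind that switch can be sketched as a call-count comparison. Everything here is a hypothetical stand-in: `CountingModel` substitutes for the Gemini API, and `vector_search` is a placeholder for the embedding-similarity lookup.

```python
class CountingModel:
    """Stand-in for a reasoning-model call that just counts invocations.
    (Name and interface are illustrative, not the project's real API.)"""

    def __init__(self):
        self.calls = 0

    def ask(self, question: str, frames: list) -> str:
        self.calls += 1
        return f"answer derived from {len(frames)} frame(s)"

N_FRAMES = 1000
frames = list(range(N_FRAMES))  # stand-ins for stored video frames

# Before: one prompt per stored frame to locate the relevant moment.
naive = CountingModel()
for frame in frames:
    naive.ask("Where did I leave my wallet?", [frame])
# naive.calls is now 1000: cost grows with the length of the recording.

def vector_search(frames: list, k: int = 3) -> list:
    """Placeholder for embedding-similarity ranking over the frame index."""
    return frames[:k]

# After: a local vector search (no model call) narrows to the top-k frames,
# then a single reasoning call runs over just those frames.
retrieval = CountingModel()
retrieval.ask("Where did I leave my wallet?", vector_search(frames))
# retrieval.calls is now 1, regardless of how much video is stored.
```

The search itself stays cheap because it is pure linear algebra over precomputed embeddings; the expensive model call happens once per question, not once per frame.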
Accomplishments that we're proud of
We successfully created a system that:
- Understands natural language
- Connects AI reasoning with real-world visuals
- Recalls objects and their last-seen locations
Seeing ARVS identify and retrieve visual memories felt like a true step toward memory augmentation.
What we learned
Through this project, we gained experience in:
- Integrating multimodal AI (vision + language)
- Using semantic vector search for fast recall
- Building scalable database systems for high-volume data
- Understanding how AI can enhance human memory and perception
What's next for Augmented Recall & Vision System
Our future plans include:
- Making ARVS fully wearable with on-device, real-time processing
- Integrating AR display overlays
- Enabling continuous memory tracking of surroundings
- Providing instant recall and interaction through vision and voice
Built With
- elevenlabs
- express.js
- faiss
- google-cloud
- javascript
- langchain
- mongodb
- next.js
- node.js
- opencv
- python
- react
- snowflake
- typescript
- vultr
