Inspiration
With so much happening around us every day, it can be hard to remember everything we see or do. We wanted to build something that could extend human memory and understanding. This idea inspired ARVS (Augmented Recall and Vision System), smart glasses that can:
- Understand what you see
- Remember your experiences
- Help you recall or analyze moments later
What it does
ARVS acts like a visual assistant that sees and remembers your environment. It uses cameras and AI to:
- Recognize objects
- Read text
- Understand context in real time
You can ask questions like:
- “Where did I leave my wallet?”
- “Can you solve that math problem I saw earlier?”
ARVS will recall the right moment from your past recordings and provide accurate, context-aware answers.
How we built it
We built ARVS using the following technologies and approaches:
- Data storage: video data stored in MongoDB
- Processing: computer vision and OCR models extract objects and text from frames
- Vectorization: each frame is converted into an embedding for fast semantic search
- AI reasoning: Google Gemini for natural language understanding and contextual reasoning
- Voice feedback: ElevenLabs for realistic speech responses
- Frontend: interactive interface for voice and text queries
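The vectorization step above can be sketched roughly as follows. The project lists FAISS in its stack, but to keep this example self-contained it uses plain NumPy cosine similarity instead; `embed_frame`, `FrameIndex`, and the 512-dimensional embedding size are illustrative assumptions, with a hash-seeded random projection standing in for a real vision embedding model.

```python
import numpy as np

DIM = 512  # assumed embedding size; a real vision model's dimension may differ

def embed_frame(frame_pixels: np.ndarray) -> np.ndarray:
    """Stand-in for a real vision embedding model: map pixels to a fixed-size
    vector and L2-normalize it, so a dot product equals cosine similarity."""
    rng = np.random.default_rng(int(frame_pixels.sum()) % (2**32))
    vec = rng.standard_normal(DIM)
    return vec / np.linalg.norm(vec)

class FrameIndex:
    """Maps frame timestamps to embeddings for semantic search
    (a minimal in-memory stand-in for a FAISS index)."""

    def __init__(self):
        self.vectors = []     # unit-norm embedding per stored frame
        self.timestamps = []  # when each frame was recorded

    def add(self, timestamp: float, frame_pixels: np.ndarray) -> None:
        self.vectors.append(embed_frame(frame_pixels))
        self.timestamps.append(timestamp)

    def search(self, query_vec: np.ndarray, k: int = 3):
        """Return the k most similar frames as (timestamp, score) pairs."""
        sims = np.stack(self.vectors) @ query_vec
        top = np.argsort(sims)[::-1][:k]
        return [(self.timestamps[i], float(sims[i])) for i in top]
```

At query time, the user's question would be embedded the same way, and the search would return the timestamps of the best-matching recorded moments for the reasoning model to examine.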
Challenges we ran into
Key challenges included:
- Reducing AI calls while maintaining accuracy
- Efficient storage and retrieval of large video data
- Maintaining low latency for real-time responses
- Ensuring meaningful recall for context-specific queries
We solved these by replacing direct prompt-based AI calls with search over vectorized embeddings, so finding the relevant moment happens locally and only the retrieved frames are sent to the model for reasoning.
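The cost difference behind that switch can be sketched as a call-count comparison. Everything here is a hypothetical stand-in: `CountingModel` substitutes for the Gemini API, and `vector_search` is a placeholder for the embedding-similarity lookup.

```python
class CountingModel:
    """Stand-in for a reasoning-model call that just counts invocations.
    (Name and interface are illustrative, not the project's real API.)"""

    def __init__(self):
        self.calls = 0

    def ask(self, question: str, frames: list) -> str:
        self.calls += 1
        return f"answer derived from {len(frames)} frame(s)"

N_FRAMES = 1000
frames = list(range(N_FRAMES))  # stand-ins for stored video frames

# Before: one prompt per stored frame to locate the relevant moment.
naive = CountingModel()
for frame in frames:
    naive.ask("Where did I leave my wallet?", [frame])
# naive.calls is now 1000: cost grows with the length of the recording.

def vector_search(frames: list, k: int = 3) -> list:
    """Placeholder for embedding-similarity ranking over the frame index."""
    return frames[:k]

# After: a local vector search (no model call) narrows to the top-k frames,
# then a single reasoning call runs over just those frames.
retrieval = CountingModel()
retrieval.ask("Where did I leave my wallet?", vector_search(frames))
# retrieval.calls is now 1, regardless of how much video is stored.
```

The search itself stays cheap because it is pure linear algebra over precomputed embeddings; the expensive model call happens once per question, not once per frame.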
Accomplishments that we're proud of
We successfully created a system that:
- Understands natural language
- Connects AI reasoning with real-world visuals
- Recalls objects and their last-seen locations
Seeing ARVS identify and retrieve visual memories felt like a true step toward memory augmentation.
What we learned
Through this project, we gained experience in:
- Integrating multimodal AI (vision + language)
- Using semantic vector search for fast recall
- Building scalable database systems for high-volume data
- Understanding how AI can enhance human memory and perception
What's next for Augmented Recall & Vision System
Our future plans include:
- Making ARVS fully wearable with on-device, real-time processing
- Integrating AR display overlays
- Enabling continuous memory tracking of surroundings
- Providing instant recall and interaction through vision and voice
Built With
- elevenlabs
- express.js
- faiss
- google-cloud
- javascript
- langchain
- mongodb
- next.js
- node.js
- opencv
- python
- react
- snowflake
- typescript
- vultr
