Inspiration
Modern camera rolls have thousands of photos but almost no meaningful organization. Finding a specific memory usually means endless scrolling. We wanted to make photo libraries searchable like the web, so you can find moments using natural language instead of manually browsing.
What it does
FotoFindr turns your camera roll into a fully searchable AI memory engine. Instead of scrolling, you can type queries like:
- “Happy dog at the park”
- “Pictures of the beach”
Our system analyzes every uploaded photo using computer vision to detect objects, faces, emotions, and scene context. The results are indexed and searchable using semantic AI search.
Users can also:
- Detect low-value images
- Generate AI narration describing photos and memories
How we built it
We built FotoFindr as a full-stack AI system.
Frontend
- React Native with Expo for a fast mobile interface
Backend
- FastAPI server handling uploads, metadata, and search
Vision & AI Pipeline
- YOLO for object detection
- DeepFace for face detection and emotion analysis
- Gemini for semantic caption generation
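The pipeline above can be sketched as one function that runs each analyzer and merges the results into a single metadata record. The stub analyzers below stand in for the real YOLO, DeepFace, and Gemini calls; their return shapes and names are illustrative assumptions, not our actual integration code:

```python
from dataclasses import dataclass, field

@dataclass
class PhotoMetadata:
    photo_id: str
    objects: list = field(default_factory=list)  # would come from YOLO
    faces: list = field(default_factory=list)    # would come from DeepFace
    caption: str = ""                            # would come from Gemini

# Stub analyzers standing in for the real model calls.
def detect_objects(image_path):
    return [{"label": "dog", "confidence": 0.92}]

def analyze_faces(image_path):
    return [{"emotion": "happy", "confidence": 0.88}]

def generate_caption(image_path):
    return "A happy dog playing at the park."

def analyze_photo(photo_id, image_path):
    """Run every analyzer on one image and merge the results
    into a single record ready for embedding and indexing."""
    return PhotoMetadata(
        photo_id=photo_id,
        objects=detect_objects(image_path),
        faces=analyze_faces(image_path),
        caption=generate_caption(image_path),
    )

meta = analyze_photo("img_001", "/photos/img_001.jpg")
```

Merging everything into one record per photo keeps downstream steps (embedding, search, narration) decoupled from which models produced the signals.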
Search System
- Photos are converted into embeddings and stored in Snowflake
- Natural language queries are embedded and matched using vector similarity search
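The matching step reduces to cosine similarity between the query embedding and each stored photo embedding. A minimal sketch (the 3-dimensional vectors are toy illustrations; real embeddings have hundreds of dimensions and are stored in Snowflake):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, index, top_k=2):
    """Rank stored photo embeddings by similarity to the query embedding."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in index.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

# Toy index: photo id -> embedding vector.
index = {
    "beach_photo": [0.9, 0.1, 0.0],
    "dog_photo":   [0.1, 0.9, 0.2],
    "receipt":     [0.0, 0.1, 0.9],
}
# An embedded query like "pictures of the beach" would point
# in roughly the same direction as the beach photo's embedding.
results = search([0.85, 0.15, 0.05], index, top_k=1)
```

In production this ranking runs inside the vector store rather than in application code, but the scoring logic is the same.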
Voice Narration
- ElevenLabs generates spoken descriptions of photos and search results
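Before any audio is generated, the photo's metadata has to be turned into a natural sentence. A sketch of that composition step (the field names in `meta` are illustrative assumptions about our metadata shape; the ElevenLabs call itself is omitted since it needs an API key):

```python
def build_narration(meta):
    """Compose a spoken-style description from a photo's metadata dict."""
    parts = [meta.get("caption", "A photo.")]
    if meta.get("emotions"):
        parts.append("The mood feels " + " and ".join(meta["emotions"]) + ".")
    if meta.get("objects"):
        parts.append("You can see " + ", ".join(meta["objects"]) + ".")
    return " ".join(parts)

text = build_narration({
    "caption": "A sunny afternoon at the beach.",
    "emotions": ["happy"],
    "objects": ["umbrella", "surfboard"],
})
# This text would then be sent to ElevenLabs' text-to-speech API,
# which returns the audio played back in the app.
```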
Photo Cleanup
- Photos without a clear subject (likely taken by accident) are detected and flagged for deletion
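The flagging rule can be sketched as a simple heuristic over the object-detection output: a photo is low-value when nothing was detected above a confidence threshold. The threshold and field names below are assumptions for illustration:

```python
def is_low_value(detections, min_confidence=0.5):
    """Flag photos with no confidently detected subject (likely accidental shots)."""
    return not any(d["confidence"] >= min_confidence for d in detections)

blurry_pocket_shot = [{"label": "unknown", "confidence": 0.12}]
clear_dog_photo = [{"label": "dog", "confidence": 0.91}]

flagged = is_low_value(blurry_pocket_shot)  # marked for deletion review
kept = is_low_value(clear_dog_photo)
```

Flagged photos are only suggested for deletion; the user confirms before anything is removed.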
Together, this pipeline converts raw images into a structured, searchable index of personal memories.
Challenges we ran into
One major challenge was combining multiple AI models into a fast pipeline. Image captioning, object detection, face analysis, and embedding generation all needed to run efficiently without slowing down the user experience.
Another challenge was semantic search accuracy. Translating natural language queries into meaningful filters required careful structuring of metadata and embeddings so results actually matched user intent.
We also had to balance powerful AI features with hackathon time constraints, focusing on the most impactful capabilities.
Accomplishments that we're proud of
We're proud that we built an end-to-end AI search engine for personal photos in a short time.
- Automatically analyzing photos with multiple computer vision models
- Building a working semantic search system
- Generating AI voice narration of memories
Most importantly, the experience feels magical: you can type a sentence and instantly rediscover moments from your camera roll.
What we learned
This project taught us a lot about building AI-first applications. We learned how to:
- Combine multiple computer vision models into a unified pipeline
- Implement vector search for natural language queries
- Design systems that transform unstructured data (images) into searchable knowledge
- Build fast prototypes that still feel like real products
It also showed us how powerful AI becomes when it’s applied to personal data and memories.
What's next for FotoFindr
Future features could include:
- Smart photo cleanup suggestions
- Timeline-based storytelling of events
- Automatic highlight reels of trips and milestones
- Privacy-first on-device processing
- Integration with existing photo libraries

Our goal is to make personal photos not just stored, but truly understood and searchable.
Built With
- deepface
- elevenlabs
- expo.io
- fastapi
- gemini
- python
- react-native
- snowflake
- typescript
- yolo