Inspiration
Every photographer knows the pain: you come back from a wedding or event with 2,000 photos and spend hours manually deleting blurry shots, blinking faces, and near-identical burst duplicates. We wanted to build something that automates this entire workflow privately, without uploading your photos to a cloud service you don't trust. Curator AI does the heavy AI lifting right in your browser.
What it does
Curator AI is an intelligent photo curation platform. Upload an album and it automatically sorts every image into four categories: keepers (sharp, eyes open), duplicates (near-identical burst frames), blurry shots, and rejected blink captures. Beyond sorting, you can search your album using natural language — type "person with a hat" or "dog on couch" and it finds matching images. It detects objects across 200+ categories with bounding boxes, and generates AI captions for each photo. Everything is exportable as a ZIP with one click. We also integrated Novus.ai to track user interactions across the app — which features get used most, where users spend time, and how they move through the curation workflow.
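To make the four-category sort concrete, here is a minimal sketch of how per-image signals could be turned into a single category. The field names, the 0.5 sharpness cutoff, and the order in which the checks run are all illustrative assumptions, not the app's actual code.

```typescript
// Hypothetical per-image signals; names and thresholds are assumptions.
type Category = "keeper" | "duplicate" | "blurry" | "blink";

interface ImageSignals {
  sharpness: number;         // higher means sharper, assumed to lie in [0, 1]
  eyesOpen: boolean;         // false when a detected face is mid-blink
  isBurstDuplicate: boolean; // true when a better frame exists in the same burst
}

const SHARPNESS_THRESHOLD = 0.5; // assumed cutoff, would need tuning on real albums

// Assumed priority: quality problems (blur, blink) win over duplicate status.
function categorize(s: ImageSignals): Category {
  if (s.sharpness < SHARPNESS_THRESHOLD) return "blurry";
  if (!s.eyesOpen) return "blink";
  if (s.isBurstDuplicate) return "duplicate";
  return "keeper";
}

// Count how many images of an album fall into each category.
function summarize(album: ImageSignals[]): Record<Category, number> {
  const counts: Record<Category, number> = { keeper: 0, duplicate: 0, blurry: 0, blink: 0 };
  for (const s of album) counts[categorize(s)] += 1;
  return counts;
}
```

Checking blur and blink before duplicates means a blurry frame in a burst is reported as blurry rather than as a duplicate; a different ordering would be just as defensible.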
How we built it
The frontend is React + Vite + TypeScript + Tailwind CSS. The core curation pipeline (blur, blink, and duplicate detection) runs entirely client-side using WebAssembly: TensorFlow.js for blur detection (CNN regressor + Laplacian variance), MediaPipe Face Landmarker for blink detection via Eye Aspect Ratio, and DINOv2 for image embeddings to group burst duplicates by cosine similarity. For natural language search, we use Grounding DINO-tiny through Hugging Face Transformers.js, also in the browser. Object detection and captioning are the exception: they are handled server-side through Roboflow's serverless API (YOLO-World and Florence-2), with a FastAPI backend on Hugging Face Spaces for session storage. The ZIP export is built entirely in the browser using JSZip. Novus.ai was integrated early in development to track every meaningful interaction (button clicks, feature usage, time per tab) so we could iterate on the product with real data rather than assumptions.
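The three measurements named above (Laplacian variance, Eye Aspect Ratio, and cosine similarity) are simple enough to sketch in plain TypeScript. This is an illustration of the formulas only, under stated assumptions: the grayscale input layout, the six-point eye contour, the greedy grouping strategy, and the 0.95 threshold are ours, and mapping MediaPipe landmarks or DINOv2 outputs onto these inputs is left out.

```typescript
// Variance of a 4-neighbour Laplacian over a row-major grayscale image.
// Low variance suggests few sharp edges, i.e. a blurrier photo.
function laplacianVariance(gray: ArrayLike<number>, width: number, height: number): number {
  const responses: number[] = [];
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      responses.push(gray[i - width] + gray[i + width] + gray[i - 1] + gray[i + 1] - 4 * gray[i]);
    }
  }
  if (responses.length === 0) return 0;
  const mean = responses.reduce((a, b) => a + b, 0) / responses.length;
  return responses.reduce((a, r) => a + (r - mean) ** 2, 0) / responses.length;
}

interface Point { x: number; y: number; }
const dist = (a: Point, b: Point): number => Math.hypot(a.x - b.x, a.y - b.y);

// Eye Aspect Ratio for six contour points: p1 and p4 are the eye corners,
// p2/p3 lie on the upper lid and p6/p5 on the lower lid. Near 0 means closed.
function eyeAspectRatio(p: [Point, Point, Point, Point, Point, Point]): number {
  const [p1, p2, p3, p4, p5, p6] = p;
  const horizontal = dist(p1, p4);
  if (horizontal === 0) return 0;
  return (dist(p2, p6) + dist(p3, p5)) / (2 * horizontal);
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Greedy grouping (an assumed strategy): each embedding joins the first group
// whose first member is at least `threshold` similar, else starts a new group.
// Returns groups of indices into `embeddings`.
function groupDuplicates(embeddings: number[][], threshold = 0.95): number[][] {
  const groups: number[][] = [];
  embeddings.forEach((embedding, index) => {
    const match = groups.find((group) => cosineSimilarity(embeddings[group[0]], embedding) >= threshold);
    if (match) match.push(index);
    else groups.push([index]);
  });
  return groups;
}
```

Comparing against only the first member of each group keeps the grouping cheap, which matters for a 2,000-photo album, but it is order-dependent; a clustering pass over all pairs would be more thorough.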
Challenges we ran into
Getting five different AI models to initialize reliably in the browser was the biggest hurdle — each has different WASM loading behavior and memory constraints. We added timeout guards and lazy initialization so the app doesn't freeze while models load. Parsing Roboflow's YOLO-World response format was also tricky since the API returns different JSON structures depending on the workflow version. We built a recursive parser that handles all variants. Another challenge was keeping the pipeline robust when individual images fail to load — the system gracefully skips timeouts and still produces results for the rest of the album. Integrating Novus.ai taught us to think carefully about what events actually matter to track versus noise — we ended up with clean, focused analytics that directly shaped our UI decisions.
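The recursive-parser idea can be sketched as a walk over arbitrary JSON that collects anything shaped like a detection, wherever it is nested. The field names below (`class`, `confidence`, `x`, `y`, `width`, `height`) and the two response shapes in the usage note are assumptions for illustration; the exact structures the API returned are not reproduced here.

```typescript
interface Detection {
  label: string;
  confidence: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Assumed detection shape: a string `class`, a numeric `confidence`, and a numeric box.
function looksLikeDetection(obj: Record<string, unknown>): boolean {
  return (
    typeof obj["class"] === "string" &&
    typeof obj["confidence"] === "number" &&
    ["x", "y", "width", "height"].every((key) => typeof obj[key] === "number")
  );
}

// Depth-first walk: arrays and plain objects are descended into,
// detection-shaped objects are collected, everything else is ignored.
function collectDetections(node: unknown, out: Detection[] = []): Detection[] {
  if (Array.isArray(node)) {
    for (const item of node) collectDetections(item, out);
  } else if (node !== null && typeof node === "object") {
    const obj = node as Record<string, unknown>;
    if (looksLikeDetection(obj)) {
      out.push({
        label: obj["class"] as string,
        confidence: obj["confidence"] as number,
        x: obj["x"] as number,
        y: obj["y"] as number,
        width: obj["width"] as number,
        height: obj["height"] as number,
      });
    } else {
      for (const key of Object.keys(obj)) collectDetections(obj[key], out);
    }
  }
  return out;
}
```

With this approach, a flat `{ predictions: [...] }` payload and a deeply nested `{ outputs: [{ model: { predictions: { predictions: [...] } } }] }` payload (both hypothetical) yield the same list, so the caller never branches on the response version.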
Accomplishments that we're proud of
The entire blur-blink-duplicate pipeline runs client-side with zero server RAM, meaning photos never leave the user's machine. We got Grounding DINO working in-browser for zero-shot natural language search, so you can search for objects that were never explicitly tagged. The four-category curation system is accurate enough to turn a 2,000-photo album into ~50 keepers with one click. And the UI came together cleanly with smooth transitions and a polished, professional feel. With Novus.ai in place, we can actually measure how the app is used: we see real users land on the app, run their first analysis, and come back for more. That data turns Curator AI from a hackathon demo into something we can genuinely iterate on.
What we learned
We learned that browser-side AI is genuinely viable for real workflows — TensorFlow.js and Transformers.js have matured significantly. The key design principle is graceful degradation: every model has a fallback path, and the pipeline never blocks on a single failure. We also learned that photographers care deeply about privacy, so the "photos never leave your machine" story resonates more than we expected. Novus analytics showed us, for example, that users spend the most time in the Gallery tab and that natural language search is the feature that surprises people — data we wouldn't have guessed without tracking.
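The graceful-degradation principle can be sketched with two small helpers: a timeout guard around any model call, and a wrapper that switches to a cheaper fallback when the primary path fails or stalls. The helper names (`withTimeout`, `withFallback`, `scoreAlbum`) are hypothetical, and this is a sketch of the pattern rather than the app's actual code.

```typescript
// Reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Try the primary path; on error or timeout, use the fallback instead.
async function withFallback<T>(primary: () => Promise<T>, fallback: () => T, ms: number): Promise<T> {
  try {
    return await withTimeout(primary(), ms);
  } catch {
    return fallback();
  }
}

// Score every image, skipping any that fail or time out so that
// one bad file never blocks the rest of the album.
async function scoreAlbum(
  images: string[],
  score: (image: string) => Promise<number>,
  ms: number,
): Promise<Map<string, number>> {
  const results = new Map<string, number>();
  for (const image of images) {
    try {
      results.set(image, await withTimeout(score(image), ms));
    } catch {
      // Skipped: the image is simply absent from the results.
    }
  }
  return results;
}
```

For example, a blur check might pass a model-based scorer as `primary` and a Laplacian-variance scorer as `fallback`, so a model that never finishes loading costs at most `ms` milliseconds per call.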
What's next for Curator AI
We want to add batch caption generation for social media (Instagram-ready captions with hashtags), a side-by-side comparison view for burst groups so photographers can manually pick the best frame, support for RAW/HEIC formats, and integration with Lightroom for direct import. We're also exploring on-device model fine-tuning so the blur detector learns from individual photographer preferences over time. With Novus.ai tracking user behavior from day one, every future feature decision will be backed by data rather than guesswork.
Built With
- fastapi
- git
- github
- jszip
- mediapipe
- novus.ai
- python
- react
- tailwind-css
- tensorflow.js
- transformers.js
- typescript
- uvicorn
- vite