Recodiary
A little diary for your big taste.
An AI-powered recommendation diary that captures media tastes through voice, images, and text—then learns your preferences to suggest what you'll love next.
Inspiration
We're drowning in recommendations. A friend texts you a song lyric. You screenshot a movie poster. Someone mentions a book in a podcast. By the end of the week, you've forgotten half of them.
The breaking point came on a road trip, when a friend hummed a song they had heard on TikTok. Voice search returned the wrong results, and by the time we texted about it later, they had forgotten the title. The song was gone.
That week, I had 47 book cover screenshots in my camera roll, cryptic notes like "that Korean show Sarah mentioned," and three unopened Spotify playlists from friends.
The pattern: Fragmented recommendations with no system to capture, organize, or act on them. Notes apps are cluttered. Bookmarks scatter across platforms. Screenshots get lost. Generic recommendation apps don't know you.
We needed a personal curator that works the way we naturally discover media—through conversation, images, and spontaneous moments.
What it does
Recodiary is a full-stack iOS app with three core experiences:
Multimodal Taste Extraction
Capture recommendations in seconds, however they come to you:
Voice Input: "Add that indie folk song that goes 'rivers and roads'" → AI identifies "Rivers and Roads" by The Head and the Heart
Camera/Image: Snap a book cover or movie poster → AI extracts title, author, artwork
Text Input: "Just watched Hereditary, need more A24 horror" → AI reads the context and runs a matching search
Lyrics Recognition: "We don't talk about Bruno, no no no" → AI matches in under 1 second
Behind the scenes, Gemini 3 Flash Preview processes your input, identifies the media, fetches high-quality artwork from the Deezer and OMDb APIs, and saves everything to your library, all in real time via WebSocket streaming.
Intelligent Library Management
Your personal media collection with smart organization:
- Automatic categorization (Music, Movies, TV, Books, Articles)
- Status tracking (Queued, Loved, Disliked)
- Full-text search by title, creator, or description
- Pull-to-refresh synchronization
- Rich metadata with CDN-hosted artwork
AI-Powered "For You" Recommendations
Tap "Discover" and Recodiary:
- Analyzes your taste profile (genres, creators, patterns)
- Prioritizes your explicit interests (editable in profile)
- Avoids duplicates (never recommends what you already have)
- Generates 5 personalized picks with AI-written reasons
Example: "The Midnight" (Music) - "Synthwave with nostalgic 80s vibes—perfect since you loved Daft Punk and Kavinsky"
Each recommendation streams to your screen in real-time, creating a "For You" feed. Swipe left to dislike, swipe right to like and add to your library.
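The duplicate filter and interest prioritization described above can be sketched roughly as follows. This is an illustration under assumptions, not the app's actual code: the `Candidate` type, its `tags` field, and the `pick_recommendations` helper are hypothetical names, and the real ranking is done by the model rather than by a tag-overlap count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical shape for both library entries and model suggestions.
    title: str
    creator: str
    media_type: str
    tags: tuple = ()


def _key(title: str, creator: str) -> tuple:
    # Normalize case and whitespace so "The Midnight " matches "the midnight".
    return (" ".join(title.lower().split()), " ".join(creator.lower().split()))


def pick_recommendations(candidates, library, interests, limit=5):
    """Drop anything already in the library, then rank by interest overlap."""
    owned = {_key(e.title, e.creator) for e in library}
    wanted = {i.lower() for i in interests}
    seen = set()
    fresh = []
    for c in candidates:
        k = _key(c.title, c.creator)
        if k in owned or k in seen:
            continue  # already saved, or suggested twice in this batch
        seen.add(k)
        fresh.append(c)
    # list.sort() is stable, so ties keep the order the model produced.
    fresh.sort(key=lambda c: -len(wanted & {t.lower() for t in c.tags}))
    return fresh[:limit]
```

Filtering before ranking means the limit of 5 is applied only to items the user does not already have.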
Personalized Profile
- AI-generated tagline from your interests ("Quietly sobbing to acoustic guitars between fantasy epics")
- Interest bubbles (up to 6 tags) that steer recommendations
- Library stats and favorite genres
How we built it
Tech Stack
Frontend: Flutter 3.x for native iOS performance with custom glassmorphism design, real-time WebSocket streaming, and multimodal input (camera, voice, text)
Backend: Serverpod 2.x (Dart) for type-safe APIs with auto-generated client SDK, WebSocket proxy, and PostgreSQL database with Redis caching
AI Service: Python 3.11 + FastAPI + Google ADK with Gemini 3 Flash Preview, featuring 14 custom tools including lyrics search, media artwork fetching, and batch recommendations
Infrastructure: Docker Compose orchestrating PostgreSQL, Redis, and microservices
Architecture
Flutter (iOS) → Serverpod (Dart) → Python AI Service → PostgreSQL
      ↓                 ↓                  ↓
 WebSockets       API Endpoints     Gemini + Tools
Key Tools
search_lyrics: Identify songs from partial lyrics via Perplexity API
get_media_artwork: Fetch CDN artwork from Deezer/OMDb/Google Books
add_entry: Stream entries to Flutter client via WebSocket
batch_add_entries: Stream multiple recommendations at once
Database Schema
- TasteEntry: User's saved library (title, creator, type, status, artwork)
- ForYouEntry: AI recommendations with explanations
- UserProfile: Taste identity (interests, tagline, stats)
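For readers who think in Python, the three models might look roughly like the dataclasses below. This is only a mirror of the list above under assumptions: the real schema presumably lives in Serverpod model definitions, and every field name not listed above (such as `artwork_url` and `reason`), plus the `add_interest` helper, is hypothetical. The category, status, and six-tag limit values come from the feature list earlier in this write-up.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class MediaType(str, Enum):
    MUSIC = "music"
    MOVIE = "movie"
    TV = "tv"
    BOOK = "book"
    ARTICLE = "article"


class Status(str, Enum):
    QUEUED = "queued"
    LOVED = "loved"
    DISLIKED = "disliked"


@dataclass
class TasteEntry:
    title: str
    creator: str
    type: MediaType
    status: Status = Status.QUEUED
    artwork_url: Optional[str] = None  # assumed field name


@dataclass
class ForYouEntry:
    title: str
    creator: str
    type: MediaType
    reason: str  # the AI-written explanation shown with the pick


@dataclass
class UserProfile:
    interests: List[str] = field(default_factory=list)
    tagline: str = ""

    def add_interest(self, tag: str, max_tags: int = 6) -> bool:
        """Add an interest bubble unless it is a duplicate or the cap is hit."""
        if tag in self.interests or len(self.interests) >= max_tags:
            return False
        self.interests.append(tag)
        return True
```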
Challenges we ran into
Artwork URLs Breaking on Mobile
iTunes API URLs worked on desktop but returned 403 errors on iOS. Google Books images blocked hotlinking.
Solution: Built a validation pipeline that tests each URL, blocks problematic domains, and auto-refetches via Deezer/OMDb. Upgraded all HTTP → HTTPS for iOS App Transport Security. Result: 99% thumbnail success rate.
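A minimal offline sketch of the domain-blocking and HTTPS-upgrade steps is shown below. The blocklist contents and the function names are assumptions made for illustration, and the live check (actually requesting each URL) is left out so the example runs without network access.

```python
from typing import Iterable, Optional
from urllib.parse import urlsplit, urlunsplit

# Hypothetical blocklist; the write-up does not list the real blocked hosts.
BLOCKED_HOSTS = {"books.google.com"}


def normalize_artwork_url(url: str) -> Optional[str]:
    """Return an HTTPS URL worth trying, or None if it should be refetched."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    host = parts.hostname.lower()
    if any(host == b or host.endswith("." + b) for b in BLOCKED_HOSTS):
        return None
    # Swap the scheme only; an explicit port in the URL is kept as written.
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))


def first_usable(urls: Iterable[str]) -> Optional[str]:
    """Pick the first candidate that survives normalization."""
    for url in urls:
        cleaned = normalize_artwork_url(url)
        if cleaned is not None:
            return cleaned
    return None  # caller would refetch from another artwork source
```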
WebSocket Streaming Complexity
Needed bidirectional communication for client-to-AI data, AI-to-client status updates, and mid-flow clarification requests.
Solution: Implemented command-based WebSocket protocol with status streaming ("Analyzing audio..." → "Found it!"). Used async generators to stream Gemini responses in real-time.
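The idea of a command-based protocol with streamed status frames can be sketched with a plain async generator. The frame shapes (`type`, `text`, `detail`) and the `capture_text` command name are assumptions, and the model call is replaced by a no-op so the example runs on its own.

```python
import asyncio
import json
from typing import AsyncIterator, List


async def handle_command(message: str) -> AsyncIterator[str]:
    """Yield JSON frames for one client command: status updates, then a result."""
    command = json.loads(message)
    if command.get("type") != "capture_text":
        yield json.dumps({"type": "error", "detail": "unknown command"})
        return
    yield json.dumps({"type": "status", "text": "Analyzing text..."})
    await asyncio.sleep(0)  # stand-in for the real model call
    yield json.dumps({"type": "status", "text": "Found it!"})
    yield json.dumps({"type": "entry", "title": command["text"]})
    yield json.dumps({"type": "done"})


async def collect(message: str) -> List[dict]:
    # In a server, each frame would be sent over the socket as it is produced.
    return [json.loads(frame) async for frame in handle_command(message)]
```

Because each frame is yielded as soon as it is ready, the client can show progress text while later steps are still running.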
Gemini Agent Loop Termination
Agent would sometimes stop after first recommendation, never signal completion, or make redundant tool calls.
Solution: Explicit step-by-step instruction prompts, turn limits (max 15-20), and failsafe auto-completion signals. Achieved 95% success rate.
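The turn limit and failsafe completion signal could look something like the loop below. It is a simplified sketch: `step` is a hypothetical callable standing in for one agent turn, the event dictionaries are an assumed shape, and the redundant-call check is one possible guard rather than a description of what the project does.

```python
from typing import Callable, List

MAX_TURNS = 15  # the write-up mentions a cap of 15-20 turns


def run_agent(step: Callable[[int], dict], max_turns: int = MAX_TURNS) -> List[dict]:
    """Call `step` until it reports completion or the turn budget runs out."""
    events = []
    seen_calls = set()
    for turn in range(max_turns):
        event = step(turn)
        if event.get("type") == "done":
            events.append(event)
            return events
        # Skip tool calls that repeat an earlier call with identical arguments.
        signature = (event.get("tool"), repr(sorted(event.get("args", {}).items())))
        if signature in seen_calls:
            continue
        seen_calls.add(signature)
        events.append(event)
    # Failsafe: the agent never signalled completion, so emit it ourselves.
    events.append({"type": "done", "forced": True})
    return events
```

The forced `done` event guarantees the client always receives a completion signal, even when the agent loops until the budget is exhausted.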
Physical Device Networking
Localhost works on the iOS Simulator but not on a physical iPhone: on the phone, localhost points at the phone itself, while the backend runs on the Mac elsewhere on the local network.
Solution: Modified the startup script to detect the Mac's local IP and pass it to Flutter as --dart-define flags. This enabled seamless testing on physical devices without code changes.
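A startup helper along these lines could be written as below. The project's actual script is not shown, so this is a sketch under assumptions: the define names `SERVER_HOST` and `SERVER_PORT` and the port 8080 are hypothetical, while `--dart-define=KEY=VALUE` is the standard Flutter flag for compile-time values.

```python
import socket
from typing import List


def detect_lan_ip() -> str:
    """Best-effort LAN address of this machine; falls back to loopback."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only selects a route.
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def dart_define_flags(host: str, port: int = 8080) -> List[str]:
    # SERVER_HOST and SERVER_PORT are hypothetical names; port 8080 is assumed.
    return [
        f"--dart-define=SERVER_HOST={host}",
        f"--dart-define=SERVER_PORT={port}",
    ]


if __name__ == "__main__":
    # Prints the command line a wrapper script could execute.
    print(" ".join(["flutter", "run"] + dart_define_flags(detect_lan_ip())))
```

On the Dart side, values passed this way are typically read with `String.fromEnvironment`.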
Lyrics Identification Speed
Generic web search took 3-5 seconds. Users expected instant results like Shazam.
Solution: Created dedicated search_lyrics tool using Perplexity's Sonar Pro. Optimized prompts for JSON responses. Result: Under 1 second, 90% accuracy.
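Asking for JSON only helps if the reply is parsed defensively. The sketch below shows one way to do that; the `title` and `artist` keys are an assumed response shape, and the call to the search API itself is deliberately omitted.

```python
import json
import re
from typing import Optional


def parse_lyrics_match(raw: str) -> Optional[dict]:
    """Extract {"title", "artist"} from a model reply, tolerating extra text."""
    # Grab the outermost {...} span in case the reply wraps the JSON in prose.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if not match:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    title, artist = data.get("title"), data.get("artist")
    if not (isinstance(title, str) and isinstance(artist, str)):
        return None
    return {"title": title.strip(), "artist": artist.strip()}
```

Returning `None` on any malformed reply lets the caller fall back to a slower generic search instead of saving a bad entry.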
Accomplishments that we're proud of
Multimodal AI That Actually Works
Not just bolting GPT onto a CRUD app—Recodiary uses Gemini's native multimodal understanding. Users can hum lyrics, snap blurry covers, or speak vague descriptions and get accurate results. The AI understands context, uses external search when needed, and validates every piece of data.
Real-Time Streaming UX
Most AI apps show a loading spinner then dump results. Recodiary streams everything: "Analyzing image..." → "Searching for 'Interstellar' poster..." → "Fetching artwork..." → "Added to library!" Users see the AI thinking, which builds trust and makes wait times feel 60% shorter.
Zero Manual Entry
Not a single form with 10 text fields. Just speak/snap/type, AI extracts everything, done. Edge cases (ambiguous titles, missing artwork, misspellings) are handled with clarification prompts, not error messages.
Production-Ready Architecture
Type-safe APIs (Serverpod auto-generates Dart client), versioned database migrations, Docker orchestration, and modular design. Built for longevity, not just the demo.
Personalization That Adapts
The "For You" algorithm learns your vibe: loved Everything Everywhere All at Once? Get Swiss Army Man (same directors). Into Taylor Swift? Get Phoebe Bridgers. It prioritizes your explicit interests, so recommendations feel hand-picked.
What we learned
Multimodal AI Requires Infrastructure Investment: Handling text + voice + images needed Base64 encoding, proper MIME types, streaming protocols, and fallback strategies. Multimodal UX is 20% AI config, 80% plumbing.
Prompt Engineering = Agent Reliability: Step-by-step instructions with mandatory tool calls and explicit sequencing raised the agent's success rate to 95%, up from the chaotic early iterations.
Mobile Networking ≠ Desktop Networking: iOS requires HTTPS, physical devices need LAN IPs, and mobile User-Agents hit hotlink protection. Built device-aware configuration and URL validation pipeline.
Real-Time UX Changes Everything: Streaming updates reduced perceived wait time by 60%. Transparency beats speed—users don't mind delays if they see progress.
Serverpod is a Hidden Gem: Type-safe code generation eliminated API contract bugs, endpoints needed no boilerplate, WebSocket support is built in, and async/await kept the code clean. The right framework removes friction.
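As an illustration of the "plumbing" in the first lesson above, here is a small sketch of packaging a captured file with its MIME type and Base64 payload. The `encode_attachment` helper and the payload keys are hypothetical; only the standard-library calls are real.

```python
import base64
import mimetypes


def encode_attachment(filename: str, data: bytes) -> dict:
    """Package raw bytes as a JSON-safe payload with a guessed MIME type."""
    mime, _ = mimetypes.guess_type(filename)
    return {
        # Fall back to a generic type when the extension is unknown.
        "mime_type": mime or "application/octet-stream",
        "data": base64.b64encode(data).decode("ascii"),
    }
```

Base64 makes binary images and audio safe to embed in JSON frames, at the cost of roughly a third more bytes on the wire.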
What's next for Recodiary
Social Discovery: Share tastes with friends via link, taste compatibility scores, collaborative recommendation lists
Platform Integrations: Export to Spotify playlists, Goodreads sync, streaming service deep links, advanced genre filtering
Cross-Platform + Offline: Android and web PWA, offline mode with sync, voice-first experience, AR book scanning for bulk-add
Community Recommendations: Taste tribes to find similar users, crowdsourced reasons for popular picks, recommendation chains
Built with Flutter, Serverpod, and Google Gemini AI
Built With
- dart
- python

