Recodiary

A little diary for your big taste.
An AI-powered recommendation diary that captures media tastes through voice, images, and text—then learns your preferences to suggest what you'll love next.

Inspiration

We're drowning in recommendations. A friend texts you a song lyric. You screenshot a movie poster. Someone mentions a book in a podcast. By the end of the week, you've forgotten half of them.

The breaking point came on a road trip, when a friend hummed a song from TikTok. Voice search returned the wrong results, and by the time we texted about it later, they had forgotten the title. The song was gone.

That week, I had 47 book cover screenshots in my camera roll, cryptic notes like "that Korean show Sarah mentioned," and three unopened Spotify playlists from friends.

The pattern: Fragmented recommendations with no system to capture, organize, or act on them. Notes apps are cluttered. Bookmarks scatter across platforms. Screenshots get lost. Generic recommendation apps don't know you.

We needed a personal curator that works the way we naturally discover media—through conversation, images, and spontaneous moments.

What it does

Recodiary is a full-stack iOS app with three core experiences:

Multimodal Taste Extraction

Capture recommendations in seconds, however they come to you:

Voice Input: "Add that indie folk song that goes 'rivers and roads'" → AI identifies "Rivers and Roads" by The Head and the Heart

Camera/Image: Snap a book cover or movie poster → AI extracts title, author, artwork

Text Input: "Just watched Hereditary, need more A24 horror" → AI parses the request into a context-aware search

Lyrics Recognition: "We don't talk about Bruno, no no no" → AI matches in under 1 second

Behind the scenes, Gemini 3 Flash Preview processes your input, identifies the media, fetches high-quality artwork from Deezer/OMDb APIs, and saves everything to your library—all in real-time via WebSocket streaming.

Intelligent Library Management

Your personal media collection with smart organization:

Automatic categorization (Music, Movies, TV, Books, Articles)
Status tracking (Queued, Loved, Disliked)
Full-text search by title, creator, or description
Pull-to-refresh synchronization
Rich metadata with CDN-hosted artwork

AI-Powered "For You" Recommendations

Tap "Discover" and Recodiary:

Analyzes your taste profile (genres, creators, patterns)
Prioritizes your explicit interests (editable in profile)
Avoids duplicates (never recommends what you already have)
Generates 5 personalized picks with AI-written reasons

Example: "The Midnight" (Music) - "Synthwave with nostalgic 80s vibes—perfect since you loved Daft Punk and Kavinsky"

Each recommendation streams to your screen in real-time, creating a "For You" feed. Swipe left to dislike, swipe right to like and add to your library.
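
The selection rules above (skip anything already in the library, favor explicit interests, cap the batch at five) can be sketched as a small filter. This is an illustration only, not the app's actual code: `Candidate`, `pick_recommendations`, and the tag-overlap scoring are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical recommendation candidate produced by the AI service."""
    title: str
    creator: str
    tags: frozenset = frozenset()


def _key(title, creator):
    # Normalize so "Nightcall" / "kavinsky" matches "Nightcall" / "Kavinsky".
    return (title.strip().lower(), creator.strip().lower())


def pick_recommendations(candidates, library, interests, limit=5):
    """Drop items already in the library, then rank by overlap with interests.

    `library` is an iterable of (title, creator) pairs the user already has.
    """
    owned = {_key(title, creator) for title, creator in library}
    wanted = {tag.lower() for tag in interests}
    seen = set()
    fresh = []
    for cand in candidates:
        key = _key(cand.title, cand.creator)
        if key in owned or key in seen:
            continue  # never recommend what the user already has (or a repeat)
        seen.add(key)
        fresh.append(cand)
    # The sort is stable, so ties keep the order the candidates arrived in.
    fresh.sort(key=lambda c: len(wanted & {t.lower() for t in c.tags}), reverse=True)
    return fresh[:limit]
```

Matching on a normalized (title, creator) pair is the simplest possible duplicate check; a real system would likely compare external IDs from the metadata providers instead.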

Personalized Profile

AI-generated tagline from your interests ("Quietly sobbing to acoustic guitars between fantasy epics")
Interest bubbles (up to 6 tags) that steer recommendations
Library stats and favorite genres

How we built it

Tech Stack

Frontend: Flutter 3.x for native iOS performance with custom glassmorphism design, real-time WebSocket streaming, and multimodal input (camera, voice, text)

Backend: Serverpod 2.x (Dart) for type-safe APIs with auto-generated client SDK, WebSocket proxy, and PostgreSQL database with Redis caching

AI Service: Python 3.11 + FastAPI + Google ADK with Gemini 3 Flash Preview, featuring 14 custom tools including lyrics search, media artwork fetching, and batch recommendations

Infrastructure: Docker Compose orchestrating PostgreSQL, Redis, and microservices

Architecture

Flutter (iOS) → Serverpod (Dart) → Python AI Service → PostgreSQL
     ↓               ↓                    ↓
  WebSockets    API Endpoints      Gemini + Tools

Key Tools

search_lyrics: Identify songs from partial lyrics via Perplexity API
get_media_artwork: Fetch CDN artwork from Deezer/OMDb/Google Books
add_entry: Stream entries to Flutter client via WebSocket
batch_add_entries: Stream multiple recommendations at once

Database Schema

TasteEntry: User's saved library (title, creator, type, status, artwork)
ForYouEntry: AI recommendations with explanations
UserProfile: Taste identity (interests, tagline, stats)
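
As a rough picture of these three records, here is a sketch using Python dataclasses. The real models live in Serverpod and are not shown in this write-up, so the exact field names and types (for example `artwork_url` and `reason`) are assumptions; the category, status, and six-interest limit are taken from the feature descriptions above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

MAX_INTERESTS = 6  # "Interest bubbles (up to 6 tags)"


class MediaType(Enum):
    MUSIC = "music"
    MOVIE = "movie"
    TV = "tv"
    BOOK = "book"
    ARTICLE = "article"


class Status(Enum):
    QUEUED = "queued"
    LOVED = "loved"
    DISLIKED = "disliked"


@dataclass
class TasteEntry:
    """An item in the user's saved library."""
    title: str
    creator: str
    type: MediaType
    status: Status = Status.QUEUED
    artwork_url: Optional[str] = None


@dataclass
class ForYouEntry:
    """An AI recommendation with its explanation."""
    title: str
    creator: str
    type: MediaType
    reason: str


@dataclass
class UserProfile:
    """Taste identity: interests, tagline, stats."""
    interests: list = field(default_factory=list)
    tagline: str = ""
    stats: dict = field(default_factory=dict)

    def add_interest(self, tag):
        if tag in self.interests:
            return  # ignore duplicates
        if len(self.interests) >= MAX_INTERESTS:
            raise ValueError("a profile holds at most 6 interests")
        self.interests.append(tag)
```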

Challenges we ran into

Artwork URLs Breaking on Mobile

iTunes API URLs worked on desktop but returned 403 errors on iOS. Google Books images blocked hotlinking.

Solution: Built a validation pipeline that tests each URL, blocks problematic domains, and auto-refetches via Deezer/OMDb. Upgraded all HTTP → HTTPS for iOS App Transport Security. Result: 99% thumbnail success rate.
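
A minimal sketch of that kind of validation step, assuming the reachability test is injected as a function so the logic can run without network access. The blocked host name is a placeholder (the write-up does not list the actual domains), and the function names are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder blocklist; the real list of problematic domains is not given.
BLOCKED_HOSTS = {"blocked-cdn.example"}


def normalize_artwork_url(url):
    """Return an HTTPS version of the URL, or None if it is unusable."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None  # malformed or non-web URL
    if parts.hostname.lower() in BLOCKED_HOSTS:
        return None  # known to fail on mobile, so skip it
    # Force HTTPS so iOS App Transport Security accepts the request.
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, parts.fragment))


def first_usable_artwork(candidates, is_reachable):
    """Return the first candidate URL that normalizes and passes the check.

    `is_reachable` is a caller-supplied function (in production it might
    issue an HTTP HEAD request); returning None signals "refetch elsewhere".
    """
    for url in candidates:
        normalized = normalize_artwork_url(url)
        if normalized is not None and is_reachable(normalized):
            return normalized
    return None
```

Passing candidate URLs in priority order gives the fallback behavior described above: if the first source fails validation, the next one is tried.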

WebSocket Streaming Complexity

Needed bidirectional communication for client-to-AI data, AI-to-client status updates, and mid-flow clarification requests.

Solution: Implemented command-based WebSocket protocol with status streaming ("Analyzing audio..." → "Found it!"). Used async generators to stream Gemini responses in real-time.
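
To illustrate the idea, here is a small sketch of a command dispatcher that streams status messages from an async generator. The message shapes (`command`, `payload`, `type`) and the handler are assumptions for this example, not the app's actual protocol, and the model call is replaced by a stand-in. In a FastAPI service, `send` could be a WebSocket's `send_text` method.

```python
import asyncio
import json


async def handle_add_command(payload):
    """Yield JSON messages for one hypothetical "add" command."""
    yield json.dumps({"type": "status", "message": "Analyzing audio..."})
    await asyncio.sleep(0)  # stand-in for the model and tool calls
    entry = {"title": payload.get("query", ""), "creator": "unknown"}
    yield json.dumps({"type": "status", "message": "Found it!"})
    yield json.dumps({"type": "entry", "data": entry})
    yield json.dumps({"type": "done"})  # explicit completion signal


HANDLERS = {"add": handle_add_command}


async def dispatch(raw_message, send):
    """Route an incoming command to its handler and forward every message."""
    command = json.loads(raw_message)
    handler = HANDLERS.get(command.get("command"))
    if handler is None:
        await send(json.dumps({"type": "error", "message": "unknown command"}))
        return
    async for message in handler(command.get("payload", {})):
        await send(message)  # each update reaches the client as soon as it exists
```

Because the handler is a generator, the client sees each status update as it is produced rather than waiting for the whole response.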

Gemini Agent Loop Termination

Agent would sometimes stop after first recommendation, never signal completion, or make redundant tool calls.

Solution: Explicit step-by-step instruction prompts, turn limits (max 15-20), and failsafe auto-completion signals. Achieved 95% success rate.
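
The turn limit and failsafe completion can be sketched as a bounded loop. Here `step` is a hypothetical callable standing in for one model/tool round trip, and the result dictionary layout is an assumption for this example.

```python
MAX_TURNS = 15  # the write-up mentions a cap of 15-20 turns


def run_agent(step, max_turns=MAX_TURNS):
    """Call `step(turn)` until it reports completion or the budget runs out.

    `step` returns a dict with an optional "entry" and a boolean "done".
    """
    entries = []
    for turn in range(max_turns):
        result = step(turn)
        if result.get("entry") is not None:
            entries.append(result["entry"])
        if result.get("done"):
            return {"entries": entries, "completed": True, "forced": False}
    # Failsafe: the agent never signalled completion, so the loop does it,
    # and the client is not left waiting on a stream that never ends.
    return {"entries": entries, "completed": True, "forced": True}
```

Tracking `forced` separately makes it possible to measure how often the failsafe fires, which is one way to arrive at a success rate like the one quoted above.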

Physical Device Networking

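Localhost works on the iOS Simulator but not on a physical iPhone: on the device, localhost refers to the phone itself, while the backend is only reachable at the Mac's address on the local network.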

Solution: Modified startup script to detect local IP and pass as Dart define flags. Enabled seamless testing on physical devices without code changes.
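
The original startup script is not shown, so the following is only a sketch of the technique in Python. `--dart-define` is a real `flutter run` flag, but the key names `SERVER_HOST` and `SERVER_PORT` and the port value are assumptions for this example.

```python
import socket


def detect_lan_ip(fallback="127.0.0.1"):
    """Best-effort LAN address of this machine."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only picks a route,
        # which reveals the address of the outgoing network interface.
        sock.connect(("8.8.8.8", 80))
        return sock.getsockname()[0]
    except OSError:
        return fallback  # no usable route, e.g. the machine is offline
    finally:
        sock.close()


def flutter_run_args(host, port=8080):
    """Build a `flutter run` command line with the backend address baked in."""
    return [
        "flutter",
        "run",
        f"--dart-define=SERVER_HOST={host}",  # hypothetical key name
        f"--dart-define=SERVER_PORT={port}",  # hypothetical key name
    ]


if __name__ == "__main__":
    print(" ".join(flutter_run_args(detect_lan_ip())))
```

On the Dart side, values passed this way can be read with `String.fromEnvironment`, so the same build configuration works for the simulator and for a phone on the same network.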

Lyrics Identification Speed

Generic web search took 3-5 seconds. Users expected instant results like Shazam.

Solution: Created dedicated search_lyrics tool using Perplexity's Sonar Pro. Optimized prompts for JSON responses. Result: Under 1 second, 90% accuracy.
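
Asking a model for JSON still means the reply has to be parsed defensively, since it may arrive wrapped in extra text. The sketch below covers only that parsing step; the `title` and `artist` keys are an assumed response shape, and the call to the search API itself is deliberately left out.

```python
import json


def parse_lyrics_match(raw):
    """Extract {"title", "artist"} from a model reply, or None if unusable."""
    # Take the outermost {...} span so surrounding chatter is ignored.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    title, artist = data.get("title"), data.get("artist")
    if not isinstance(title, str) or not isinstance(artist, str):
        return None
    if not title.strip() or not artist.strip():
        return None  # treat empty fields as "no match"
    return {"title": title.strip(), "artist": artist.strip()}
```

Returning None instead of raising lets the caller fall back to a slower general search or ask the user for clarification.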

Accomplishments that we're proud of

Multimodal AI That Actually Works

Not just bolting GPT onto a CRUD app—Recodiary uses Gemini's native multimodal understanding. Users can hum lyrics, snap blurry covers, or speak vague descriptions and get accurate results. The AI understands context, uses external search when needed, and validates every piece of data.

Real-Time Streaming UX

Most AI apps show a loading spinner then dump results. Recodiary streams everything: "Analyzing image..." → "Searching for 'Interstellar' poster..." → "Fetching artwork..." → "Added to library!" Users see the AI thinking, which builds trust and makes wait times feel 60% shorter.

Zero Manual Entry

Not a single form with 10 text fields. Just speak/snap/type, AI extracts everything, done. Edge cases (ambiguous titles, missing artwork, misspellings) are handled with clarification prompts, not error messages.

Production-Ready Architecture

Type-safe APIs (Serverpod auto-generates Dart client), versioned database migrations, Docker orchestration, and modular design. Built for longevity, not just the demo.

Personalization That Adapts

The "For You" algorithm learns your vibe: loved Everything Everywhere All at Once? Get Swiss Army Man (same directors). Into Taylor Swift? Get Phoebe Bridgers. It prioritizes your explicit interests, so recommendations feel hand-picked.

What we learned

Multimodal AI Requires Infrastructure Investment: Handling text + voice + images needed Base64 encoding, proper MIME types, streaming protocols, and fallback strategies. Multimodal UX is 20% AI config, 80% plumbing.
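
As a small example of that plumbing, this sketch packages an uploaded file as Base64 with a MIME type guessed from its name. The payload keys `mime_type` and `data` and the helper name are assumptions for this example; the exact structure the model API expects may differ.

```python
import base64
import mimetypes


def build_media_payload(filename, data):
    """Wrap raw image or audio bytes in a JSON-friendly payload."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith(("image/", "audio/")):
        # Fail early with a clear message instead of sending unusable bytes.
        raise ValueError(f"unsupported media type for {filename!r}")
    return {
        "mime_type": mime,
        "data": base64.b64encode(data).decode("ascii"),
    }
```

Guessing from the file extension is the simplest approach; inspecting the file's leading bytes would be more robust for uploads with missing or wrong extensions.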

Prompt Engineering = Agent Reliability: Step-by-step instructions with mandatory tool calls and explicit sequencing brought the agent to a 95% success rate, up from the erratic behavior of early iterations.

Mobile Networking ≠ Desktop Networking: iOS requires HTTPS, physical devices need LAN IPs, and mobile User-Agents hit hotlink protection. We built device-aware configuration and a URL validation pipeline to handle all three.

Real-Time UX Changes Everything: Streaming updates reduced perceived wait time by 60%. Transparency beats speed—users don't mind delays if they see progress.

Serverpod is a Hidden Gem: Type-safe code generation eliminated API contract bugs, endpoints needed no boilerplate, WebSocket support is built in, and async/await keeps the code clean. The right framework removes friction.

What's next for Recodiary

Social Discovery: Share tastes with friends via link, taste compatibility scores, collaborative recommendation lists

Platform Integrations: Export to Spotify playlists, Goodreads sync, streaming service deep links, advanced genre filtering

Cross-Platform + Offline: Android and web PWA, offline mode with sync, voice-first experience, AR book scanning for bulk-add

Community Recommendations: Taste tribes to find similar users, crowdsourced reasons for popular picks, recommendation chains

Built with Flutter, Serverpod, and Google Gemini AI

Built With

dart
python

Updates

Victor Bash started this project — Jan 24, 2026 08:24 AM EST
