Inspiration

Every family has stories that only exist in someone's memory — a grandmother's wedding day, a father's first job, a summer afternoon that changed everything. But most people never write them down, and when that person is gone, the stories go with them.

Famoir was born from a simple idea: what if preserving a family memory was as easy as having a conversation? No writing required — just talk, and AI handles the rest.

What it does

Famoir is an AI-powered family memoir platform. Users upload a meaningful photo, then have a real-time voice conversation with an AI interviewer that draws out the story behind the image. The system then transforms that conversation into a beautifully written memoir chapter — ready to read, share, and keep as a family book.

How we built it

Famoir runs on 5 AI agents, orchestrated through Google's Agent Development Kit (ADK) and grouped into three stages:

  1. PreSessionPipeline (SequentialAgent): chains a Photo Analyst (Gemini 2.5 Flash Vision), which interprets the uploaded photo, with a Context Builder, which formats those insights for the interview.

  2. Interviewer (LlmAgent): connects through the Gemini Live API for real-time bidirectional voice streaming. It reads the photo context and adapts its questions as the conversation unfolds, picking up on emotional cues and following up on them.

  3. PostSessionPipeline (SequentialAgent → LoopAgent): a quality control loop that cycles between a Narrator (transforms the conversation into literary prose), a Quality Checker (evaluates narrative quality), and an Escalation Checker (pass/fail gate). The loop runs up to 2 automatic revision iterations with no human intervention.
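
The control flow of that quality loop can be sketched in plain Python. This is a minimal illustration, not our ADK code: `Review`, `run_revision_loop`, `fake_narrate`, and `fake_check` are hypothetical names, the narrator and checker are stand-in functions rather than LLM calls, and it assumes "up to 2 revisions" means one initial draft plus at most two rewrites.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    """Hypothetical verdict produced by the Quality Checker step."""
    passed: bool
    feedback: str


def run_revision_loop(
    transcript: str,
    narrate: Callable[[str, Optional[str]], str],
    check: Callable[[str], Review],
    max_revisions: int = 2,
) -> tuple[str, int]:
    """Draft once, then revise until the check passes or the budget is spent.

    Returns the final draft and the number of revisions that were used.
    """
    draft = narrate(transcript, None)  # first pass has no feedback yet
    revisions = 0
    while True:
        review = check(draft)
        # Pass/fail gate: stop on a pass, or when no revisions remain.
        if review.passed or revisions >= max_revisions:
            return draft, revisions
        draft = narrate(transcript, review.feedback)
        revisions += 1


# Stand-ins for the Narrator and Quality Checker (assumptions for the demo).
def fake_narrate(transcript: str, feedback: Optional[str]) -> str:
    return transcript.upper() if feedback else transcript


def fake_check(draft: str) -> Review:
    return Review(passed=draft.isupper(), feedback="use more vivid language")


chapter, used = run_revision_loop("we met in 1962", fake_narrate, fake_check)
print(chapter, used)
```

In this sketch the first draft fails the stand-in check, one revision fixes it, and the gate exits the loop early instead of spending the second revision.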

The frontend is built with React + TypeScript + Vite and connects to the backend over a WebSocket for real-time audio streaming. Everything is deployed as a single container on Google Cloud Run, with Cloud Firestore for persistence and Firebase Auth (phone SMS OTP) for authentication.

Challenges we faced

  • Gemini Live API latency: real-time bidirectional voice streaming requires careful buffer management. We built a custom ring-buffer AudioWorklet to replace AudioBufferSourceNode-based playback, which reduced audio latency and eliminated subtitle flickering.
  • Interrupt handling: when a user interrupts the AI mid-sentence, in-flight audio chunks still arrive. We implemented a suppressAudio flag that discards post-interrupt audio cleanly.
  • Quality control loop: getting the Narrator to produce consistently literary-quality output required careful prompt engineering and structured output schemas (Pydantic), plus the LoopAgent pattern for automatic revision.
  • Single-container deployment: serving both the FastAPI backend and the React frontend from one Cloud Run container required a custom build pipeline, but it simplified deployment significantly.
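
The first two challenges come down to one small data structure. Our real version lives in a browser AudioWorklet, so the Python below is only a sketch of the logic: the `AudioRingBuffer` class, its method names, and the choice to drop samples when the buffer is full are assumptions made for illustration. Only the idea of a suppress flag comes from our implementation.

```python
from __future__ import annotations


class AudioRingBuffer:
    """Fixed-size sample queue with a flag that discards late audio."""

    def __init__(self, capacity: int) -> None:
        self._buf = [0.0] * capacity
        self._capacity = capacity
        self._read = 0    # index of the next sample to play
        self._write = 0   # index of the next free slot
        self._size = 0    # number of queued samples
        self.suppress_audio = False

    def write(self, samples: list[float]) -> int:
        """Queue incoming samples and return how many were accepted."""
        if self.suppress_audio:
            return 0  # chunk arrived after an interrupt: discard it
        written = 0
        for sample in samples:
            if self._size == self._capacity:
                break  # full: drop the rest rather than overwrite unplayed audio
            self._buf[self._write] = sample
            self._write = (self._write + 1) % self._capacity
            self._size += 1
            written += 1
        return written

    def read(self, n: int) -> list[float]:
        """Return exactly n samples, padding with silence on underrun."""
        out = [0.0] * n
        count = min(n, self._size)
        for i in range(count):
            out[i] = self._buf[self._read]
            self._read = (self._read + 1) % self._capacity
        self._size -= count
        return out

    def interrupt(self) -> None:
        """User spoke over the AI: drop queued audio and ignore new chunks."""
        self.suppress_audio = True
        self._read = self._write
        self._size = 0

    def resume(self) -> None:
        """A new AI turn started: accept audio again."""
        self.suppress_audio = False


buf = AudioRingBuffer(capacity=4)
buf.write([0.1, 0.2, 0.3])
first = buf.read(2)                # two queued samples
buf.interrupt()
accepted = buf.write([0.9, 0.9])   # in-flight chunk is ignored
silence = buf.read(2)              # nothing queued, so silence
buf.resume()
buf.write([0.5])
after = buf.read(2)                # one sample, then silence padding
```

Because `read` always returns a full frame, the playback side never stalls waiting for data, and the flag keeps stale audio from leaking into the next turn.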

What we learned

  • The LoopAgent pattern in ADK is incredibly powerful for quality control — having agents review and revise each other's work produces dramatically better output than single-pass generation.
  • Gemini Live API enables a completely new interaction paradigm — voice-first AI that feels like talking to a real person, not typing into a chatbox.
  • The hardest part of building a memoir tool isn't the AI; it's designing an experience gentle enough that an 82-year-old grandmother would feel comfortable using it.

Built With

  • cloud-firestore
  • fastapi
  • firebase-auth
  • gemini-2.5-flash
  • gemini-live-api
  • google-agent-development-kit-(adk)
  • google-cloud-run
  • python
  • react
  • shadcn/ui
  • tailwind-css
  • typescript
  • vite
  • websocket