Coach AI OS - Gemini 3 Hackathon Submission

Elevator Pitch

The Play Store for autonomous AI coaches. Grounded in expert methodologies, they manage your TODOs and follow-ups via Gemini 3's deep reasoning, semantic memory, and real-time voice mode.


Inspiration

The inspiration for Coach AI OS started at my first startup, Kohbee. Working closely with trainers and mentors, I saw that the right coach doesn't just give advice. They cut through the noise and make you focus on what actually matters.

To bring this into my own life, Aika and I started weekly accountability retrospectives. We'd track what got done, what didn't, what went wrong, and set clear tasks for the next week. Simple stuff borrowed from Product Management, but it worked. Until life got busy and syncing schedules became impossible. We needed an accountability partner that was always available but didn't just passively listen.

I tried every AI tool out there. None of them worked. They'd never follow up on what I said last week, never push back when my plans were unrealistic, never bring any structure. I was managing the AI instead of the other way around.

Then I found creators like Ali Abdaal and Tony Robbins who had research-driven coaching frameworks, proven methodologies that were nuanced and actually effective. I built my first AI coach grounded in one of these frameworks, shared it with friends, and started getting requests. "Can you tweak this for me?" "Can I get a version for health goals?" People who couldn't afford $200-500/hour coaching were getting real value from methodology-grounded AI. That's when it clicked. This isn't a chatbot. It's a Coach OS: an autonomous system that follows proven frameworks to help you get clarity across work, health, family, and wealth.


What it does

Coach AI OS is a platform where creators build AI coaches grounded in their methodologies, and users find the right coach for whatever they're dealing with right now.

On the Explore page, you browse coaches built for specific situations: productivity, career decisions, health habits, content creation, and more. Each coach is grounded in a real methodology through Gemini's File Search API. Creators upload their frameworks, books, and style guides, and the coach retrieves relevant guidance during every conversation. You're not getting generic advice. You're getting answers that follow a proven system.

These coaches don't just talk. They operate in what we call an Autonomous Growth Arc. When you share a goal, the coach breaks it down, saves it to your task list, writes session memos, and follows up next time you check in. It uses Gemini 3's High Thinking mode to reason through your situation before responding, pushing back on unrealistic plans and helping you see patterns you'd miss on your own. Over time, the coach's semantic memory builds a picture of your journey, surfacing insights from weeks ago when they become relevant today.

When typing isn't enough, you switch to real-time voice coaching powered by Gemini's Live API. Same coach, same memory, same methodology, just a conversation.

If none of the existing coaches fit, you can build your own. Upload your methodology documents, define the persona and communication style, test it live, and publish it to the marketplace. Expert coaching that used to cost $200-500/hour becomes accessible to anyone.


How we built it

We built Coach AI OS as a monorepo with a Flutter frontend and a Python backend built on the google-genai SDK and FastAPI.

The agent is the core of everything. We built a custom agent loop running on gemini-3-flash-preview with thinking_level: HIGH, and gave it 15 function tools that let it take real action during a conversation. The agent manages its own tool orchestration, deciding what to call, in what order, based on the conversation. We handle thought signatures across multi-turn sessions to maintain reasoning coherence.

Methodology grounding uses the File Search API directly. When a creator uploads framework documents, we index them into a FileSearchStore. During coaching, the agent calls search_methodology_files() which queries the store through Gemini 3:

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=f"Extract information from the provided files to answer: {query}",
    config={
        "tools": [types.Tool(
            file_search=types.FileSearch(
                file_search_store_names=[file_search_store_name],
            ),
        )],
    },
)

This means coaching advice is grounded in real methodology content, not just system prompt instructions.

Semantic memory uses Gemini text embeddings and Firestore vector search. When the agent saves a memory, we generate an embedding with text-embedding-005 and store it alongside the content. When retrieving relevant memories, we use Firestore's find_nearest with cosine similarity to surface insights from weeks ago that are relevant to today's conversation. This is what makes the coaching relationship feel continuous.

The data model has two layers. Global context like values, preferences, and goals lives in users/{uid} and is shared across all coaches. Coach-specific memories and documents live in user_coaches/{uid}_{coach_id}, scoped to that relationship. This means switching coaches doesn't lose your identity, but each coach builds its own understanding of you.

The frontend is Flutter with Riverpod, targeting iOS and Android. 46 screens covering the marketplace, chat with markdown rendering, an 8-step coach creation wizard, document management (TODOs and memos), voice overlay for Live API streaming, and a RevenueCat paywall. Firebase handles auth and Firestore syncs state between the app and backend.

The backend deploys to Cloud Run as a containerized FastAPI application, with Firebase Admin SDK for auth verification and Google Secret Manager for prompt privacy.

To ship a production-grade full-stack system covering mobile, agentic backend, embeddings, and real-time voice in hackathon time, we ran a multi-agent pair programming workflow. Antigravity and Claude Code acted as technical teammates for scaffolding the Flutter UI and orchestrating complex backend logic. CodeRabbit handled AI-native code reviews powered by Gemini to keep every PR production-ready. And we used Imagen 3 to generate all coaching avatars and empty-state illustrations, maintaining a consistent premium aesthetic across the marketplace without needing a designer.


Challenges we ran into

Making the agent autonomous, not just responsive. The hardest design challenge was moving beyond chat-and-reply. We needed the agent to decide on its own when to save a memory, when to create a goal, when to push back on a plan. Getting this right meant careful iteration on 15 tool descriptions and the system instruction so the LLM understood not just what each tool does, but when to use it.

We built on ADK, then moved off it. We started with Google's Agent Development Kit as our backend framework. As we pushed into production-level features like custom session management, streaming with parallel tool calls, and fine-grained auth control, we kept hitting walls with ADK's still-evolving surface area. Rather than fighting the framework, we made the call to go direct with the google-genai SDK and FastAPI. It gave us full control over the agent loop, tool orchestration, and session handling while still using Gemini 3's native capabilities like thinking levels, thought signatures, and File Search.

Voice mode end-to-end. The Live API uses a different model than our text agent, which meant the voice session needed its own configuration while still sharing the same user context and memory. Coordinating audio streaming in Flutter, WebSocket connections, and voice activity detection with tool execution all had to work without noticeable latency.

Building on APIs newer than the tools that help you build. Gemini 3, ADK, and File Search are all 2026 releases. Our AI coding assistants would confidently generate code using deprecated Gemini 2.0 patterns and SDK methods that no longer exist. We ended up maintaining our own reference docs with current model names, SDK patterns, and known gotchas to keep everything grounded in what actually works.

File Search only works with Gemini API keys, not Vertex AI. We discovered this partway through and had to rearchitect our auth flow to maintain separate paths for File Search versus other services.

Firestore vector indexing. Getting find_nearest working for semantic memory required specific composite vector index configuration. We built a graceful fallback to importance/recency sorting when the index isn't available.


Accomplishments that we're proud of

The biggest win was when the coach started working the way we originally imagined it. It would remember what you talked about last session, save a goal when you mentioned one in passing, and bring up something you committed to doing. None of that is one feature. It's the memory, the tools, and the reasoning all working together. Getting that right took most of our effort and it shows in the experience.

We also shipped a lot more than we expected to. Full marketplace, coach creation flow, chat with streaming, voice mode, semantic memory, document management, and a paywall. 46 screens and 15 backend tools, all connected. It's not a prototype with placeholder screens. Everything works end to end.

The follow-up system is probably the thing we're most proud of practically. The coach tracks your tasks and sends you a notification to check in, not just waiting for you to open the app. That's the feature that makes it feel like coaching instead of chatting.


What we learned

The most surprising insight came from early user conversations. People don't just want a good coach. They want to know who made it. Trust in the coach is directly tied to trust in the creator. You'd naturally pick a productivity coach built by Ali Abdaal over one built by an anonymous account. That fundamentally shaped our product. We added creator profiles, methodology descriptions, and visible grounding sources so users could see exactly who built the coach and what it's based on before starting a conversation.

Grounding LLMs in up-to-date documentation is non-negotiable. Gemini 3, File Search, and the genai SDK were all released this quarter. Nothing online is accurate yet. We learned early to maintain our own reference files with current model names, API patterns, and known issues. Every AI coding tool we used would confidently write code for deprecated APIs. The fix was simple: ground them in the right docs, just like we ground our coaches in the right methodologies.

Consistent visual design without a designer is possible now. We used Nano Banana Pro through Antigravity to generate all coaching avatars and empty-state illustrations. With the right prompting, the outputs stayed visually consistent across the entire marketplace. That's something that would have taken a designer days to produce.

The hardest part of building an agentic system isn't the AI. It's the tool design. The model is smart enough to figure out what to do if you describe your tools well. We spent more time writing clear tool descriptions and return formats than we did on the system prompt. When the tools are well-defined, the agent makes good decisions. When they're vague, it guesses and gets it wrong.


What's next for Coach AI OS

We're currently running a closed beta through TestFlight and using the app ourselves every week. At the same time, we've started reaching out to coaches and creators to gauge their interest in building AI coaches on the platform.

Beyond the product roadmap, what we're really testing is a bigger question. Can AI coaching actually help people be more productive in life, not just at work? Right now it's easier than ever to fall into the trap of doing too many things at once. Juggling side projects, optimizing your health, managing finances, picking up new skills, all while trying to stay focused on what actually matters. The whole reason coaching exists is to help you cut through that noise. We want to see if making that accessible to everyone changes how people work and how they live.

Built With

  • antigravity
  • claude-code
  • coderabbit
  • dart
  • fastapi
  • file-search-api
  • firebase-auth
  • firebase-firestore
  • firestore-vector-search
  • flutter
  • gemini-3-flash
  • gemini-3-pro
  • gemini-live-api
  • gemini3
  • google-cloud-run
  • google-genai-sdk
  • google-secret-manager
  • python
  • revenuecat-sdk
  • riverpod
  • text-embedding-api
Share this project:

Updates