Daily Coach — About the Project

Why We Built This

The goal: make voice-based AI coaching actually work. Most AI coaching apps are just chatbots with a different UI. You type, it types back, nothing feels real.

When OpenAI released their Realtime API, the opportunity was clear: voice latency around 300ms means you can actually have a conversation. No typing, no lag, just talking. We built Daily Coach to prove you could do real coaching sessions entirely by voice.

How We Built It

Tech stack: React Native + Expo for mobile, Node.js + Express + MongoDB for backend. RevenueCat handles subscriptions.

Key decisions:

  • NativeWind for styling — keeps things consistent, handles dark mode cleanly
  • Zustand for UI state, React Query for server data — covers everything without Redux complexity

The voice pipeline:

  1. Client connects to OpenAI via WebRTC
  2. Audio streams directly between the client and OpenAI, never touches our servers
  3. Transcripts sync back every 30 seconds + on session end

Context injection — OpenAI's Realtime API doesn't remember anything. Every session starts fresh. We inject the coach's personality, the user's goals, and the last 5 session summaries at the start of each call. That's how coaches "remember" previous conversations.
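
A minimal sketch of how that injected context could be assembled into one instruction string. The type names, field names, and prompt wording here are assumptions for illustration, not the app's actual schema; only the "last 5 summaries" rule comes from the description above.

```typescript
// Hypothetical shapes; the real schema is not described in this write-up.
interface CoachProfile {
  name: string;
  personality: string;
}

interface SessionSummary {
  endedAt: string; // ISO 8601 UTC timestamp, e.g. "2025-01-03T10:00:00Z"
  summary: string;
}

const MAX_SUMMARIES = 5; // "last 5 session summaries" from the text

function buildSessionInstructions(
  coach: CoachProfile,
  goals: string[],
  summaries: SessionSummary[],
): string {
  // Sort oldest first, then keep only the most recent MAX_SUMMARIES entries.
  const recent = [...summaries]
    .sort((a, b) => a.endedAt.localeCompare(b.endedAt))
    .slice(-MAX_SUMMARIES);

  const goalLines =
    goals.length > 0 ? goals.map((g) => `- ${g}`).join("\n") : "- (none set yet)";
  const historyLines =
    recent.length > 0
      ? recent.map((s) => `- ${s.endedAt.slice(0, 10)}: ${s.summary}`).join("\n")
      : "- (first session)";

  return [
    `You are ${coach.name}, a voice coach. ${coach.personality}`,
    "User goals:",
    goalLines,
    "Previous sessions (oldest first):",
    historyLines,
  ].join("\n");
}
```

The resulting string would be sent as the session's instructions when the call starts. Capping the history keeps the prompt small, which matters because everything injected here is processed before the coach can respond.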

Voice mapping — 8 coaches, 10 available OpenAI voices. Built a mapping system based on gender and tone. Challenging male coach → "ash" voice. Warm female coach → "shimmer". Fallback to "alloy" if something breaks.
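
The mapping could look roughly like the sketch below. Only the "ash", "shimmer", and "alloy" assignments come from the description above; the other table entries and the key format are assumptions added to make the example complete.

```typescript
const FALLBACK_VOICE = "alloy";

// Keys are "gender:tone". Entries marked hypothetical are not from the write-up.
const VOICE_MAP = new Map<string, string>([
  ["male:challenging", "ash"],
  ["female:warm", "shimmer"],
  ["male:warm", "echo"], // hypothetical
  ["female:challenging", "coral"], // hypothetical
]);

function voiceForCoach(gender?: string, tone?: string): string {
  const key = `${gender ?? ""}:${tone ?? ""}`.toLowerCase();
  // Unknown or missing attributes fall through to the default voice.
  return VOICE_MAP.get(key) ?? FALLBACK_VOICE;
}
```

Resolving the fallback inside the lookup means a coach with missing or misspelled metadata still gets a working voice instead of failing the session.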

Transcript sync — WebRTC connections drop unexpectedly (network issues, app backgrounding). Built status tracking (active, paused, ended, error, abandoned) with retry logic. If a session dies mid-conversation, the partial transcript still saves.
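
One way to keep those statuses consistent is an explicit transition table plus a capped backoff for sync retries. The five status names come from the text; the allowed transitions and the delay values are assumptions for this sketch.

```typescript
type SessionStatus = "active" | "paused" | "ended" | "error" | "abandoned";

// Hypothetical transition rules: "ended" and "abandoned" are terminal.
const ALLOWED: Record<SessionStatus, SessionStatus[]> = {
  active: ["paused", "ended", "error", "abandoned"],
  paused: ["active", "ended", "abandoned"],
  error: ["active", "ended", "abandoned"],
  ended: [],
  abandoned: [],
};

function transition(current: SessionStatus, next: SessionStatus): SessionStatus {
  if (!ALLOWED[current].includes(next)) {
    throw new Error(`Illegal transition: ${current} -> ${next}`);
  }
  return next;
}

// Exponential backoff for transcript sync retries, capped at maxMs.
// attempt is zero-based: 0 -> baseMs, 1 -> 2 * baseMs, and so on.
function retryDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

Rejecting illegal transitions early makes it harder for a late network callback to flip a finished session back to active after its transcript has been saved.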

Real-time UI — Users expect instant feedback. OpenAI sends audio in chunks. We buffer it, queue it, and play it sequentially with expo-audio. Transcripts update via the WebRTC data channel. Keeping audio, transcripts, and UI status in sync without race conditions took multiple iterations.
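
The sequential playback part can be sketched as a promise chain. The playback function is injected as a parameter here, so the example makes no claims about expo-audio's API; the class and method names are hypothetical.

```typescript
type PlayChunk<T> = (chunk: T) => Promise<void>;

class SequentialPlayer<T> {
  private tail: Promise<void> = Promise.resolve();
  private pending = 0;
  private readonly play: PlayChunk<T>;
  private readonly onError: (err: unknown) => void;

  constructor(play: PlayChunk<T>, onError: (err: unknown) => void = () => {}) {
    this.play = play;
    this.onError = onError;
  }

  // Chain each chunk behind the previous one so playback never overlaps.
  enqueue(chunk: T): void {
    this.pending += 1;
    this.tail = this.tail
      .then(() => this.play(chunk))
      .catch((err) => this.onError(err)) // a bad chunk must not stall the queue
      .then(() => {
        this.pending -= 1;
      });
  }

  // Number of chunks not yet finished; useful for a "speaking" UI indicator.
  get queued(): number {
    return this.pending;
  }

  // Resolves once everything enqueued so far has finished playing.
  idle(): Promise<void> {
    return this.tail;
  }
}
```

Because every chunk is appended to a single chain, ordering is decided at enqueue time rather than by whichever playback call happens to finish first, which removes one common source of race conditions.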

Subscriptions — RevenueCat abstracts iOS/Android payment differences. We process webhooks for renewals, cancellations, and billing issues. Free tier enforcement happens both client-side (instant feedback) and server-side (prevents API bypass). Webhooks aren't guaranteed, so we also sync on every app launch and after purchases. Triple redundancy.

Onboarding — Anonymous auth via device ID. Users can start a session without signing up. When they eventually create an account, we merge the anonymous data (sessions, transcripts, subscription) into their real account. This required careful sequencing to avoid orphaned data.

What We Learned

Voice latency is everything. Anything over 500ms feels broken. OpenAI averages 250–350ms, which just barely works. Every slow database query adds perceived lag. The entire pipeline has to stay lean.

Context > intelligence. GPT-4 is smart, but raw intelligence only produced generic advice. Injecting past session summaries and specific frameworks (GTD, Eisenhower Matrix, etc.) made the difference between generic advice and useful coaching.

Subscription infrastructure is complex. RevenueCat handles a lot, but not everything. Still needed webhook idempotency, user aliasing, dual-sync fallbacks, free tier enforcement on both sides, and careful state management.
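
Webhook idempotency, for example, comes down to remembering which event ids have already been handled. The sketch below assumes each webhook event carries a unique id, and uses an in-memory Set as a stand-in for what would more likely be a unique index in MongoDB. The class and field names are hypothetical.

```typescript
// Hypothetical, simplified event shape for illustration.
interface WebhookEvent {
  id: string;
  type: string; // e.g. a renewal, cancellation, or billing issue event
  appUserId: string;
}

type ProcessResult = "processed" | "duplicate";

class WebhookProcessor {
  private readonly seen = new Set<string>();
  private readonly handle: (event: WebhookEvent) => void;

  constructor(handle: (event: WebhookEvent) => void) {
    this.handle = handle;
  }

  process(event: WebhookEvent): ProcessResult {
    if (this.seen.has(event.id)) {
      return "duplicate"; // redelivery: acknowledge without reapplying
    }
    this.handle(event);
    // Record the id only after the handler succeeds, so a failed
    // delivery can be retried by the sender.
    this.seen.add(event.id);
    return "processed";
  }
}
```

A duplicate should still be acknowledged with a success response, otherwise the sender keeps retrying an event that has already been applied.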

Challenges

OpenAI's Realtime API is unforgiving. Ephemeral tokens expire in 5 minutes (backend has to be fast). No session persistence (context injection required). Audio format is strict — PCM, 24kHz, 16-bit, mono. Anything else fails silently. Error messages are cryptic. Solution: extensive logging, retry logic, and explicit failure mode testing.
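
For the audio format requirement, a conversion from floating-point samples to 16-bit PCM might look like the sketch below. It assumes the input is already mono at 24kHz and does no resampling; the function names are hypothetical.

```typescript
const REALTIME_SAMPLE_RATE = 24000; // Hz, from the format described above

// Convert float samples in [-1, 1] to signed 16-bit PCM, clamping out-of-range values.
function floatToPcm16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  samples.forEach((sample, i) => {
    const clamped = Math.max(-1, Math.min(1, sample));
    // Negative and positive ranges are asymmetric in 16-bit audio.
    out[i] = clamped < 0 ? Math.round(clamped * 0x8000) : Math.round(clamped * 0x7fff);
  });
  return out;
}

// Duration of a mono PCM16 buffer, handy for sanity-checking what gets sent.
function pcm16DurationMs(pcm: Int16Array): number {
  return (pcm.length / REALTIME_SAMPLE_RATE) * 1000;
}
```

Since a wrong format fails silently, logging the sample count and computed duration next to the expected values gives an early signal when a buffer was captured at the wrong sample rate.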

Free tier enforcement needs both sides. Client-side gives instant feedback. Server-side prevents API bypass. But keeping counts in sync is tricky if the client's cache is stale. Solution: useSubscription hook refreshes on app launch and after every relevant action. Backend is source of truth, client caches aggressively.
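
A sketch of the shared rule, assuming a simple per-user session count. The limit value, the snapshot shape, and the staleness window are all hypothetical; the write-up does not state the real quota.

```typescript
// Hypothetical limit; the actual free tier quota is not stated in this write-up.
const FREE_SESSION_LIMIT = 3;

interface UsageSnapshot {
  isSubscriber: boolean;
  sessionsUsed: number;
  fetchedAt: number; // epoch milliseconds when the server returned this data
}

// Same rule on both sides: the client evaluates its cached snapshot,
// the server evaluates fresh values before starting a session.
function canStartSession(usage: UsageSnapshot): boolean {
  return usage.isSubscriber || usage.sessionsUsed < FREE_SESSION_LIMIT;
}

// Client-side helper: should the cached snapshot be refetched before trusting it?
function isStale(usage: UsageSnapshot, now: number, maxAgeMs = 60000): boolean {
  return now - usage.fetchedAt > maxAgeMs;
}
```

The client result only decides what the UI shows. The server runs the same check against its own data, so a stale or tampered cache cannot unlock extra sessions.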

Anonymous-to-authenticated transitions are brittle. When signing up, we merge sessions, alias the RevenueCat user, update the auth store, and preserve history. If any step fails, you get orphaned data or double subscriptions. Added transaction-like sequencing with error rollback.
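
The transaction-like sequencing can be approximated by pairing every step with an undo action and unwinding completed steps in reverse when one fails. This is a generic sketch of that pattern, not the app's actual merge code; the step names in the usage note are illustrative.

```typescript
interface MergeStep {
  name: string;
  run: () => Promise<void>;
  undo: () => Promise<void>;
}

async function runWithRollback(steps: MergeStep[]): Promise<void> {
  const done: MergeStep[] = [];
  for (const step of steps) {
    try {
      await step.run();
      done.push(step);
    } catch (err) {
      // Undo completed steps in reverse order, then surface the failure.
      for (const finished of done.reverse()) {
        await finished.undo();
      }
      throw new Error(`Merge failed at "${step.name}": ${String(err)}`);
    }
  }
}
```

A signup flow would pass steps such as merging sessions, aliasing the RevenueCat user, and updating the auth store, in that order. Undo actions can fail too, so in practice each one would need its own logging, and some external effects may only be partially reversible.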

What's Next

MVP works. Voice sessions, subscriptions, custom coaches, session history, evaluations.

Future: commitment tracking (proactive follow-ups), adaptive programs (branching based on responses), coach marketplace (publish and monetize), group coaching (multi-user sessions), integration hooks (Notion, Slack, fitness apps).

Shipping this version first. Working product beats bloated roadmap.
