Inspiration

Sailing is one of the most complex sports to master. Unlike other activities where you can pause and think, sailors must make split-second decisions based on constantly changing conditions—wind shifts, gusts, waves, and boat performance. Professional racing teams have dedicated tacticians and coaches, but recreational sailors and amateur racers often sail alone or with inexperienced crew, missing critical opportunities to improve their speed and technique.

We asked ourselves: what if every sailor could have an expert coach on board, one that understands the boat's performance data in real time and provides actionable guidance exactly when needed?

What it does

AI Sailing Coach is an iOS app that provides real-time sailing guidance through two complementary AI coaches:

Voice Coach (Gemini Live API): A conversational AI coach that listens and speaks naturally. Sailors can ask questions hands-free while keeping their eyes on the water: "Should I tack now?" or "What's my VMG looking like?" The coach receives live boat data every 20 seconds and proactively offers advice when conditions change.

Visual Coach (Gemini 3 Flash Preview): Four always-visible instruction panes that update every 10 seconds with:

  • Performance: Current boat speed vs. target with percentage indicator
  • Headsail: Recommended sail (Genoa, Code 0, or Gennaker) based on true wind angle
  • Steering: Course corrections (Steady, Head Up, Bear Away)
  • Sail Trim: Sheet adjustments (Hold, Sheet In, Ease)

The app connects to Signal K, the open marine data standard, making it compatible with virtually any modern sailing instrumentation system.
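
As an illustration of what consuming Signal K data can look like, here is a minimal sketch that decodes a Signal K delta message and converts a few values from SI units (m/s, radians) into knots and degrees. The `BoatSnapshot` type and `applyDelta` function are hypothetical names for this example, not the app's actual code, and only three paths are handled.

```swift
import Foundation

// Minimal model of a Signal K delta message. Only numeric values are kept;
// object-valued paths (e.g. navigation.position) decode to nil.
struct SignalKDelta: Decodable {
    struct Update: Decodable {
        let values: [PathValue]
    }
    struct PathValue: Decodable {
        let path: String
        let value: Double?

        enum CodingKeys: String, CodingKey { case path, value }

        init(from decoder: Decoder) throws {
            let container = try decoder.container(keyedBy: CodingKeys.self)
            path = try container.decode(String.self, forKey: .path)
            // A non-numeric value becomes nil instead of failing the whole message.
            value = try? container.decode(Double.self, forKey: .value)
        }
    }
    let updates: [Update]
}

// Hypothetical snapshot of the fields a coach might need, in sailor-friendly units.
struct BoatSnapshot {
    var speedKnots: Double?
    var apparentWindAngleDeg: Double?
    var apparentWindSpeedKnots: Double?
}

// Applies one delta message to the snapshot, ignoring paths it does not know.
func applyDelta(_ json: Data, to snapshot: inout BoatSnapshot) throws {
    let delta = try JSONDecoder().decode(SignalKDelta.self, from: json)
    let knotsPerMeterPerSecond = 1.943844
    for update in delta.updates {
        for item in update.values {
            guard let value = item.value else { continue }
            switch item.path {
            case "navigation.speedThroughWater":
                snapshot.speedKnots = value * knotsPerMeterPerSecond
            case "environment.wind.angleApparent":
                snapshot.apparentWindAngleDeg = value * 180 / .pi
            case "environment.wind.speedApparent":
                snapshot.apparentWindSpeedKnots = value * knotsPerMeterPerSecond
            default:
                break
            }
        }
    }
}
```

Keeping the unit conversion in one place means the rest of the app never has to deal with radians or m/s.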

How we built it

  • SwiftUI for the native iOS interface with custom Canvas-drawn sail icons
  • Gemini Live API with WebSocket bidirectional streaming for real-time voice interaction using native audio (16-bit PCM, 16 kHz)
  • Gemini 3 Flash Preview with Response Schema for guaranteed structured JSON output powering the visual coaching panes
  • Signal K integration with a built-in simulator for demonstration and testing
  • Combine framework for reactive data flow throughout the app

The visual coach uses Gemini's structured output support: setting responseMimeType: "application/json" together with a responseSchema constrains each response to the schema, so recommendations parse reliably as long as the output token budget is large enough for the full reply.
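
A minimal sketch of how such a configuration and its matching Swift types could look is shown here. The field names (`performancePercent`, `headsail`, `steering`, `trim`) and the enum raw values are assumptions made for this example; the schema keywords follow our understanding of the Gemini REST format and should be checked against the current API reference.

```swift
import Foundation

// Hypothetical enums mirroring the pane states; raw values are illustrative.
enum Headsail: String, Codable, CaseIterable { case genoa = "GENOA", code0 = "CODE_0", gennaker = "GENNAKER" }
enum Steering: String, Codable, CaseIterable { case steady = "STEADY", headUp = "HEAD_UP", bearAway = "BEAR_AWAY" }
enum Trim: String, Codable, CaseIterable { case hold = "HOLD", sheetIn = "SHEET_IN", ease = "EASE" }

// The shape we expect back from the model.
struct VisualAdvice: Codable {
    let performancePercent: Double
    let headsail: Headsail
    let steering: Steering
    let trim: Trim
}

// Builds a string schema whose allowed values come straight from a Swift enum,
// so the schema and the decoder cannot drift apart.
func enumSchema<E: CaseIterable & RawRepresentable>(_ type: E.Type) -> [String: Any]
where E.RawValue == String {
    ["type": "STRING", "enum": type.allCases.map { $0.rawValue }]
}

// Assembles the generationConfig part of a request body.
func makeGenerationConfig() -> [String: Any] {
    let properties: [String: Any] = [
        "performancePercent": ["type": "NUMBER"],
        "headsail": enumSchema(Headsail.self),
        "steering": enumSchema(Steering.self),
        "trim": enumSchema(Trim.self)
    ]
    let schema: [String: Any] = [
        "type": "OBJECT",
        "properties": properties,
        "required": ["performancePercent", "headsail", "steering", "trim"]
    ]
    return [
        "responseMimeType": "application/json",
        "responseSchema": schema,
        "maxOutputTokens": 1024
    ]
}

// Decoding throws if the model returns a value outside the enums.
func decodeAdvice(_ text: String) throws -> VisualAdvice {
    try JSONDecoder().decode(VisualAdvice.self, from: Data(text.utf8))
}
```

Because the enums drive both the schema and the decoder, an unexpected value surfaces as a decoding error rather than an undefined UI state.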

Challenges we ran into

Audio format compatibility: Getting bidirectional audio working with Gemini Live required precise audio session configuration. We had to configure the AVAudioSession before creating the audio engine and make the microphone stream match the expected format exactly (16-bit PCM, 16 kHz, mono).
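
One part of the format matching can be sketched in isolation: audio taps commonly deliver normalized Float32 samples, which have to be converted to 16-bit PCM bytes before streaming. The helper below is a hypothetical example (the name `pcm16Data` is ours) and assumes the samples are already mono and resampled to 16 kHz.

```swift
import Foundation

/// Converts normalized Float32 samples (-1.0...1.0) to little-endian 16-bit PCM bytes.
/// Out-of-range samples are clamped rather than allowed to overflow.
func pcm16Data(from samples: [Float]) -> Data {
    var data = Data(capacity: samples.count * MemoryLayout<Int16>.size)
    for sample in samples {
        let clamped = max(-1.0, min(1.0, sample))
        let scaled = Int16((clamped * Float(Int16.max)).rounded())
        // Append the two bytes of the sample in little-endian order.
        withUnsafeBytes(of: scaled.littleEndian) { data.append(contentsOf: $0) }
    }
    return data
}
```

Clamping matters here because a loud gust hitting the microphone can push samples past full scale, and an unchecked conversion to Int16 would trap.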

Gemini 3 "thinking" tokens: When we first implemented the visual coach with Gemini 3 Flash Preview, responses came back truncated: just "Here is the JSON requested:" with no actual JSON. We discovered that Gemini 3 spends internal "thinking" tokens (~250 tokens) before generating output, and these count against the output limit. With maxOutputTokens: 256, there was almost no room left for the actual response. Increasing the limit to 1024 tokens solved this.
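
A rough sketch of how this kind of truncation could be detected and budgeted for follows. The response field names (`finishReason`, `usageMetadata`, `thoughtsTokenCount`, `candidatesTokenCount`) reflect our understanding of the REST response and should be verified against the API reference; `suggestedMaxOutputTokens` and its safety factor are purely illustrative.

```swift
import Foundation

// Only the fields needed for budgeting; everything is optional so a missing
// field never breaks decoding.
struct GenerateContentResponse: Decodable {
    struct Candidate: Decodable { let finishReason: String? }
    struct Usage: Decodable {
        let thoughtsTokenCount: Int?
        let candidatesTokenCount: Int?
    }
    let candidates: [Candidate]?
    let usageMetadata: Usage?
}

// True when the first candidate stopped because it ran out of tokens.
func wasTruncated(_ response: GenerateContentResponse) -> Bool {
    response.candidates?.first?.finishReason == "MAX_TOKENS"
}

// Budget = (thinking tokens + expected answer tokens) with some headroom.
func suggestedMaxOutputTokens(thinkingTokens: Int,
                              expectedAnswerTokens: Int,
                              safetyFactor: Double = 1.5) -> Int {
    Int((Double(thinkingTokens + expectedAnswerTokens) * safetyFactor).rounded(.up))
}
```

Checking the finish reason before parsing gives a clearer error than a JSON decoding failure on a half-written reply.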

Wind angle calculations: Sailing physics is tricky. True Wind Angle (TWA) should always be greater than Apparent Wind Angle (AWA) when sailing upwind, because the boat's forward motion adds a headwind component that pulls the apparent wind toward the bow. Getting this right required careful vector mathematics.
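
The vector relationship can be sketched as follows: resolve the apparent wind into components along and across the boat's heading, remove the headwind created by boat speed, and recombine. This is a simplified model that assumes no leeway or current, and the `trueWind` function and `TrueWind` type are names chosen for this example.

```swift
import Foundation

struct TrueWind {
    let angleDeg: Double  // true wind angle off the bow, in degrees
    let speed: Double     // true wind speed, same unit as the inputs
}

/// Derives true wind from apparent wind and boat speed.
/// Angles are measured from the bow; speeds must share one unit (e.g. knots).
func trueWind(apparentAngleDeg: Double, apparentSpeed: Double, boatSpeed: Double) -> TrueWind {
    let awa = apparentAngleDeg * .pi / 180
    // Component along the heading, minus the headwind the boat itself creates.
    let along = apparentSpeed * cos(awa) - boatSpeed
    // Component across the heading is unaffected by forward motion.
    let across = apparentSpeed * sin(awa)
    let twa = atan2(across, along) * 180 / .pi
    let tws = (along * along + across * across).squareRoot()
    return TrueWind(angleDeg: twa, speed: tws)
}
```

For example, an apparent wind of 15 knots at 30° with 6 knots of boat speed works out to a true wind of roughly 10.3 knots at about 47°, so the true angle is wider than the apparent one, as expected.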

Accomplishments that we're proud of

  • Seamless voice interaction: The Gemini Live integration feels like talking to a real coach—natural, responsive, and contextually aware of your sailing situation
  • Guaranteed structured output: Using Gemini 3's Response Schema, the visual coach returns schema-conforming JSON whenever the token budget allows a complete response, with enum-constrained values that map directly to UI states
  • Real sailing intelligence: The recommendations aren't generic—they're based on actual performance polars and racing tactics (e.g., recommending to bear away and build speed when below 95% performance)
  • Production-ready architecture: Clean separation between services, reactive data binding, and persistent user preferences
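
To make the 95% rule mentioned above concrete, here is a deliberately simplified sketch. In the app the recommendations come from the model, so this is only one way the rule could be expressed in code; the function names and the default threshold parameter are assumptions for illustration.

```swift
// The three steering states shown in the Steering pane.
enum SteeringAdvice: String {
    case steady = "Steady"
    case headUp = "Head Up"
    case bearAway = "Bear Away"
}

/// Boat speed as a percentage of the polar target; nil if no valid target is known.
func performancePercent(boatSpeed: Double, targetSpeed: Double) -> Double? {
    guard targetSpeed > 0 else { return nil }
    return boatSpeed / targetSpeed * 100
}

/// Below the threshold, bear away to build speed; otherwise hold course.
func steeringAdvice(forPerformance percent: Double, threshold: Double = 95) -> SteeringAdvice {
    percent < threshold ? .bearAway : .steady
}
```

A real tactical rule would also weigh wind angle and sea state, which is where the model's broader context comes in.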

What we learned

  • Gemini 3's structured output is powerful: The responseSchema feature eliminates prompt engineering gymnastics for JSON output. Define your schema once, and every complete response conforms to it.
  • Gemini Live opens new UX paradigms: Bidirectional audio streaming enables truly hands-free interfaces—critical for activities where users can't look at or touch their devices.
  • Token budgeting matters: With thinking models like Gemini 3, you need to account for internal reasoning tokens, not just output tokens.
  • Domain expertise + AI = magic: The AI becomes truly useful when grounded in real sailing knowledge—performance targets, tactical rules, and sail selection logic.

What's next for AI Sailing Coach

  • Apple Watch companion: Quick glance recommendations on your wrist
  • Race mode: Mark roundings, start sequence countdown, and tactical overlays
  • Learning system: Track performance over time and identify patterns in sailing technique
  • Multiplayer: Connect crew members' devices for coordinated maneuvers
  • Integration with autopilot: Automated course corrections based on AI recommendations
  • Weather routing: Long-passage planning with Gemini analyzing GRIB weather data

Built With

  • combine
  • figma
  • gemini-3-flash-api
  • gemini-live-api
  • signalk
  • swift