Director Chat: Zero Extra API Calls

The Director Chat feature went through three major architecture rewrites. The final version uses Gemini Live API's native capabilities to eliminate all extra API calls per interaction:

  • Native input/output audio transcription (replaced 2 separate STT calls)
  • Native function calling via generate_story tool (replaced external intent detection)
  • Context window compression with SlidingWindow (replaced manual conversation trimming)
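As a rough sketch, the three features above can be switched on in a single session config. The dict below uses the field names of the google-genai Python SDK's Live API config as best I understand them. The `generate_story` parameter schema (a single `premise` string) is an assumption, since the post names the tool but not its arguments.

```python
def build_live_config() -> dict:
    """Build a Live API session config as a plain dict."""
    # Hypothetical schema: the post names the tool but not its parameters.
    generate_story = {
        "name": "generate_story",
        "description": "Generate a story once the user has settled on an idea.",
        "parameters": {
            "type": "object",
            "properties": {"premise": {"type": "string"}},
            "required": ["premise"],
        },
    }
    return {
        "response_modalities": ["AUDIO"],
        # Empty dicts request transcription with default settings,
        # standing in for the two separate STT calls.
        "input_audio_transcription": {},
        "output_audio_transcription": {},
        # The conversational model itself decides when to call the tool.
        "tools": [{"function_declarations": [generate_story]}],
        # Server-side sliding window instead of manual conversation trimming.
        "context_window_compression": {"sliding_window": {}},
    }
```

A dict like this would typically be passed as the `config` argument when opening the Live session; the exact call site in this project is not shown in the post.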

The result: the same model that holds the conversation makes the generation decision atomically, so there are no race conditions and no external classifiers guessing from lossy transcripts. The manual Suggest button remains as a fallback for the roughly 30-40% of cases where tool calling doesn't fire in audio mode.
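The fallback logic can be sketched roughly as follows. This is an illustration, not the project's actual code: it assumes each server message may carry a `tool_call` with a list of `function_calls` (each with `name` and `args`), and the helper names `extract_story_call` and `next_action` are hypothetical.

```python
def extract_story_call(message):
    """Return the args of a generate_story call in this message, or None."""
    tool_call = getattr(message, "tool_call", None)
    if tool_call is None:
        return None
    for call in tool_call.function_calls or []:
        if call.name == "generate_story":
            return dict(call.args or {})
    return None


def next_action(messages):
    """Decide what the UI should do after one conversational turn."""
    for message in messages:
        args = extract_story_call(message)
        if args is not None:
            # The model made the decision itself: generate right away.
            return ("generate", args)
    # Tool calling did not fire this turn: surface the manual Suggest button.
    return ("show_suggest_button", None)
```

Because the decision comes from the message stream of the conversation itself, there is no second request to wait on; the Suggest button only appears when a turn ends without a `generate_story` call.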

This was our biggest lesson: before building workarounds, check if the API you're already using has native support. The platform team usually thought of it first.
