Frenlin — Your AI Conversationalist Aid

Inspiration

Most AI coding tools on the market feel like the same polite autocomplete that waits silently for you to type. But coding alone at 2 AM isn't a typing problem; it's a focus and motivation problem. We wanted something that actually talks back, holds you accountable, and feels less like a tool and more like an intense, weirdly supportive friend sitting next to you. So we built a voice-to-voice companion with the energy of Frieren/Vegeta/Zoey and the attentiveness of a productivity coach.

We named it Frenlin — a blend of friend and gremlin. The "fren" is the point: this thing is genuinely on your side, the late-night buddy who wants you to ship. The "gremlin" is the honesty: it's a little chaotic, it talks back, and it won't let you doomscroll between tabs in peace. The name keeps the warmth and self-aware comedy we cared about while making it something we'd actually say out loud — and put on our GitHubs.

(Fun fact: this project had a very different name when we first built it. We rebranded to Frenlin so the repo, the extension, and our dignity could finally share one identity.)

What it does

Frenlin is a VS Code companion you talk to and that talks back, out loud, in character. It listens for natural pauses (no push-to-talk), responds with an ElevenLabs voice, and acts as a genuine productivity partner. It manages a persistent todo list through plain speech ("add fix the auth bug," "mark the refactor done"), enforces focus by calling you out when you flick between tabs without editing, and intervenes when you've lingered too long on a sensitive .env file. It also stays aware of your active file and language, celebrates real wins with confetti, and hands you a working resource link when you ask for docs.
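To make "listens for natural pauses" concrete, here is a minimal sketch of energy-based end-of-turn detection. It is not Frenlin's actual detector: the class name, the thresholds, and the 20 ms frame size are all assumptions chosen for illustration. The idea is that a turn ends only after real speech has been heard and then followed by a long enough silence, so short mid-sentence pauses don't cut you off.

```python
from dataclasses import dataclass, field


@dataclass
class EndOfSpeechDetector:
    """Illustrative end-of-turn detector; all defaults are assumed values."""

    frame_ms: int = 20               # duration of each audio frame
    energy_threshold: float = 0.02   # RMS level treated as speech
    min_speech_ms: int = 200         # ignore blips shorter than this
    silence_ms: int = 800            # pause length that ends the turn
    _speech: int = field(default=0, init=False)
    _silence: int = field(default=0, init=False)

    def feed(self, rms: float) -> bool:
        """Feed one frame's RMS energy; return True when the turn has ended."""
        if rms >= self.energy_threshold:
            # Speech frame: extend the utterance and cancel any pending pause.
            self._speech += self.frame_ms
            self._silence = 0
            return False
        if self._speech < self.min_speech_ms:
            # Silence after too little speech: treat it as noise and reset.
            self._speech = 0
            return False
        self._silence += self.frame_ms
        if self._silence >= self.silence_ms:
            # Long enough pause after real speech: the turn is over.
            self._speech = 0
            self._silence = 0
            return True
        return False
```

Raising `silence_ms` makes the companion more patient but slower to answer; lowering it does the opposite, which is the trade-off described under the challenges.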

How we built it

A TypeScript VS Code extension drives the webview UI (HTML/CSS plus the Web Audio API for the live waveform and mic), backed by a Python Flask server. OpenAI's ChatGPT model (gpt-4o-mini) generates a single structured JSON reply containing spoken text, an emotion, optional todo actions, and optional UI widgets, so one round-trip can both talk and drive the interface. Whisper handles transcription and ElevenLabs handles text-to-speech. Focused service modules (TodoManager, WorkspaceContextService, FocusMonitor, EnvMonitor) own the productivity logic, and a stack-based widget manager swaps the todo board for celebration/resource widgets and restores it automatically.
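As a rough sketch of how a single structured reply can be handled on the server, the snippet below parses the model's output into one object and falls back to safe defaults when the JSON is missing or malformed. The field names (`speech`, `emotion`, `todo_actions`, `widgets`) and the emotion set are hypothetical; the real schema may differ.

```python
import json
from dataclasses import dataclass, field

# Hypothetical emotion set; the real list is not given in this write-up.
ALLOWED_EMOTIONS = {"neutral", "happy", "angry", "proud"}


@dataclass
class Reply:
    speech: str
    emotion: str = "neutral"
    todo_actions: list = field(default_factory=list)
    widgets: list = field(default_factory=list)


def parse_reply(raw: str) -> Reply:
    """Parse model output into a Reply, falling back to safe defaults."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Non-JSON output is treated as plain spoken text.
        return Reply(speech=raw.strip())
    if not isinstance(data, dict):
        return Reply(speech=str(data))

    emotion = data.get("emotion", "neutral")
    if not isinstance(emotion, str) or emotion not in ALLOWED_EMOTIONS:
        emotion = "neutral"

    todo = data.get("todo_actions")
    widgets = data.get("widgets")
    return Reply(
        speech=str(data.get("speech", "")),
        emotion=emotion,
        todo_actions=todo if isinstance(todo, list) else [],
        widgets=widgets if isinstance(widgets, list) else [],
    )
```

Validating everything in one place means the voice, the todo board, and the widgets all consume the same checked object, and a malformed reply degrades to speech only instead of breaking the UI.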

Challenges we ran into

Natural turn-taking was brutal: voice activity detection had to tell when you've actually stopped talking from when you're just pausing, without cutting you off or hanging for ten seconds. Audio playback fought us constantly (suspended AudioContexts silently swallowing the voice, the mic re-arming before the AI even started speaking). We migrated the whole brain from Gemini to ChatGPT mid-project for cost and reliability. And the resource links the model generated were frequently hallucinated 404s, so we added server-side URL validation with a smart search fallback.
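The link fix can be sketched roughly as follows, using only the Python standard library. This is an assumed shape, not the project's actual code: the function names are hypothetical, and the DuckDuckGo query URL stands in for whatever search fallback Frenlin really uses. The liveness check is passed in as a parameter so it can be swapped out or faked in tests.

```python
from typing import Callable
from urllib.parse import quote_plus, urlparse
from urllib.request import Request, urlopen


def url_is_live(url: str, timeout: float = 3.0) -> bool:
    """Return True if a HEAD request succeeds with a status below 400."""
    try:
        req = Request(url, method="HEAD", headers={"User-Agent": "link-check"})
        with urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except (OSError, ValueError):
        # HTTP errors, DNS failures, timeouts, and malformed URLs all count as dead.
        return False


def resolve_resource(
    url: str, topic: str, is_live: Callable[[str], bool] = url_is_live
) -> str:
    """Return the model's URL if it resolves, else a search link for the topic."""
    parsed = urlparse(url)
    if parsed.scheme in ("http", "https") and parsed.netloc and is_live(url):
        return url
    # Fallback (assumed): a search query never 404s, even if it is less direct.
    return "https://duckduckgo.com/?q=" + quote_plus(topic)
```

Some servers reject HEAD requests, so a production version would likely retry with GET before giving up on a link.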

Accomplishments that we're proud of

A fully conversational, hands-free experience that feels like a real back-and-forth. A genuinely useful todo system you control entirely by voice. Proactive, in-character interventions (focus and .env safety) that respect cooldowns instead of spamming you. And resource links that always open something real, never a dead end.
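For a sense of what "respecting cooldowns" could look like, here is a small sketch of a focus nudge that fires after several tab switches with no edit in between, then stays quiet for a while. The class name `FocusNudger` and both default values are hypothetical and are not taken from Frenlin's FocusMonitor.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FocusNudger:
    """Illustrative focus nudge with a cooldown; defaults are assumed values."""

    max_switches: int = 5       # tab switches without an edit before nudging
    cooldown_s: float = 120.0   # minimum gap between spoken nudges
    _switches: int = field(default=0, init=False)
    _last_nudge: Optional[float] = field(default=None, init=False)

    def on_edit(self) -> None:
        """Any edit counts as real work and clears the switch counter."""
        self._switches = 0

    def on_tab_switch(self, now: float) -> bool:
        """Record a tab switch at time `now` (seconds); True means nudge."""
        self._switches += 1
        if self._switches < self.max_switches:
            return False
        in_cooldown = (
            self._last_nudge is not None
            and now - self._last_nudge < self.cooldown_s
        )
        if in_cooldown:
            return False
        self._last_nudge = now
        self._switches = 0
        return True
```

Passing the timestamp in, rather than reading the clock inside the class, keeps the logic easy to test and lets the caller use whatever clock the extension host provides.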

What we learned

Voice UX is a different beast — latency and turn-taking matter more than raw model quality. Structured JSON output beat multi-round tool-calling for our needs: simpler, faster, and easier to maintain. And the difference between a "tool" and a "partner" is mostly personality and proactivity — knowing when to speak up unprompted.

What's next for Frenlin

More companion personalities and custom voice cloning, smarter context (git state, test results, build errors), pomodoro and break enforcement, mood-aware tone shifting, and team mode so your companion can hype up an entire squad — not just one developer at 2 AM.
