Inspiration
Why we built Vocode
The day before this hackathon, I cut my finger badly enough that I literally couldn’t type.
For most people, that’s just an inconvenience. But as a programmer, it completely blocked my ability to build. I found myself thinking: why is programming still so dependent on a keyboard?
That question became more meaningful when I thought about my uncle. He spent over 20 years at Microsoft as a programmer, but now lives with neuropathy. Typing for extended periods causes him real pain—making it difficult to continue doing something he loves.
That’s when the idea for Vocode clicked.
What if programming didn’t require typing at all—and even better, what if you could code faster than typing?
What it does
What is Vocode?
Vocode is a voice-driven, agentic code editor that lets you write, navigate, and edit code, ask questions, and manage the filesystem using natural speech, all at the speed of conversation.
Instead of manually typing syntax, you can say:
“Go to the enemy loop. Add a bounds check before accessing the array.”
Vocode understands your intent and applies precise edits directly to your codebase.
But what makes Vocode fundamentally different is that it’s not vibecoding.
Most AI coding tools generate large chunks of code for users who may not fully understand or control the result. That approach is powerful—but it takes the developer out of the driver’s seat.
Vocode takes the opposite approach:
- You stay in control of the architecture and decisions
- The AI executes small, scoped edits based on your intent
- Each interaction is precise, composable, and reversible
The workflow becomes:
- Resolve a scope (“go to the enemy loop”)
- Apply a change within that scope (“add a bounds check”)
Smaller, structured edits enable more precise expression than large code generation ever could.
The result is a new paradigm:
- Not typing code
- Not generating code blindly
- But directing code at the speed of speech
This is vocoding.
How we built it
Vocode is built as a modular, multi-process system designed for real-time voice interaction with code.
Core stack:
- Monorepo powered by pnpm and Turborepo, with Biome and GitHub CI
- Frontend built as a VS Code extension using TypeScript and React (with support for additional frontends in the future)
- Landing page built with React, Vite, and Tailwind, hosted on Vercel
Voice pipeline:
- A dedicated voice daemon written in Go handles microphone input
- Uses cgo with PortAudio for low-level audio capture
- Custom Voice Activity Detection (VAD) determines when the user is speaking
- Audio is streamed to ElevenLabs Scribe v2 for real-time speech-to-text
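The write-up doesn't detail the custom VAD, but a minimal energy-based sketch in Go shows the general shape; the threshold, hangover count, and frame size here are assumptions for illustration, not Vocode's actual values:

```go
package main

import (
	"fmt"
	"math"
)

// VAD is a hypothetical energy-based voice activity detector.
type VAD struct {
	Threshold float64 // RMS level above which a frame counts as speech
	Hangover  int     // silent frames tolerated before speech "ends"
	silentRun int
	speaking  bool
}

// rms computes the root-mean-square energy of one int16 audio frame.
func rms(frame []int16) float64 {
	if len(frame) == 0 {
		return 0
	}
	var sum float64
	for _, s := range frame {
		f := float64(s) / 32768.0
		sum += f * f
	}
	return math.Sqrt(sum / float64(len(frame)))
}

// Process consumes one frame and reports whether the user is speaking.
func (v *VAD) Process(frame []int16) bool {
	if rms(frame) >= v.Threshold {
		v.speaking = true
		v.silentRun = 0
	} else if v.speaking {
		v.silentRun++
		if v.silentRun > v.Hangover {
			v.speaking = false
		}
	}
	return v.speaking
}

func main() {
	v := &VAD{Threshold: 0.05, Hangover: 2}
	loud := make([]int16, 160) // 10 ms at 16 kHz
	for i := range loud {
		loud[i] = 8000
	}
	quiet := make([]int16, 160)
	fmt.Println(v.Process(loud))  // speech detected
	fmt.Println(v.Process(quiet)) // still "speaking" within the hangover
}
```

The hangover keeps short pauses between words from splitting one utterance into several transcription requests.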
AI + orchestration:
- A separate core daemon (Go) manages AI interaction
- Supports multiple providers (e.g. OpenAI, Anthropic)
- Communication between components happens over duplex JSON-RPC
Edit system (core innovation): Vocode uses a two-step, scope-based editing model:
- Scope resolution: the user specifies where to operate ("Find the main function"), and a project-wide search returns matches
- Scoped modification: the user specifies what to change ("Make it do X")
A selection window shows all matches, and users can navigate between them using voice before applying changes.
This separation of where and what enables precise, composable edits—making voice-driven programming both fast and controlled.
Challenges we ran into
Building a real-time, voice-driven coding system introduced challenges across multiple layers:
- Coordinating distributed components: managing communication between the VS Code extension, voice daemon, and core AI daemon, especially over JSON-RPC, required careful orchestration and debugging
- Designing a robust intent system: not all speech is equal. We had to distinguish between:
- Code edits
- Navigation commands
- File/text search
- General questions to the AI
- UI control actions
Handling these different intent flows reliably—while keeping the experience seamless—was a major challenge.
- Dealing with noisy and ambiguous input: speech transcripts are often imperfect. We had to handle irrelevant or partial input while still extracting meaningful intent
- Low-level audio processing: implementing microphone input via PortAudio (through cgo), building a custom VAD system, and streaming audio efficiently to ElevenLabs required working close to the hardware layer
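The intent categories listed above need a routing step before anything executes. The keyword rules below are a toy stand-in for illustration; real classification would presumably go through the AI provider:

```go
package main

import (
	"fmt"
	"strings"
)

// Intent mirrors the categories from the write-up.
type Intent int

const (
	CodeEdit Intent = iota
	Navigation
	Search
	Question
	UIControl
	Unknown
)

// Classify routes a transcript to an intent. These prefix rules are a
// hypothetical placeholder for a model-based classifier.
func Classify(transcript string) Intent {
	t := strings.ToLower(strings.TrimSpace(transcript))
	switch {
	case t == "":
		return Unknown // noisy/partial input is dropped, not acted on
	case strings.HasPrefix(t, "go to"), strings.HasPrefix(t, "open"):
		return Navigation
	case strings.HasPrefix(t, "find"), strings.HasPrefix(t, "search"):
		return Search
	case strings.HasPrefix(t, "what"), strings.HasPrefix(t, "why"), strings.HasPrefix(t, "how"):
		return Question
	case strings.HasPrefix(t, "close"), strings.HasPrefix(t, "show"):
		return UIControl
	default:
		return CodeEdit
	}
}

func main() {
	fmt.Println(Classify("Go to the enemy loop") == Navigation) // true
	fmt.Println(Classify("Add a bounds check") == CodeEdit)     // true
}
```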
Accomplishments that we're proud of
- Built a working speech → intent → structured code edit pipeline
- Demonstrated real-time coding at the speed of speech, not just transcription (API latency still exists)
- Created a system where developers remain fully in control, not replaced by AI
- Designed a scope-based editing model that enables precise, composable changes
- Introduced a new paradigm beyond "vibecoding"
What we learned
- Translating natural language into precise, structured code edits is far more complex than generating code
- Separating scope (where) from intent (what) dramatically improves reliability and usability
- Voice interfaces require fundamentally different UX patterns than traditional developer tools
- Accessibility-driven ideas can lead to fundamentally better tools for everyone
- Building at both the systems level (audio, daemons) and the AI level (intent interpretation) requires careful boundary design
What's next for Vocode
Vocode introduces a new paradigm: intent-driven programming.
We’re not building a better autocomplete—we’re redefining how developers interact with code.
Expanding the intent system
- More expressive and reliable scoped operations
- Deep codebase awareness
Improved conversation
- Seamless and fluid real-time interaction and conversation with your codebase
Advanced multi-step edits and refactoring
- Coordinated changes across files using structured intent
- More LSP-based tools such as "extract this into a reusable module"
Personalized coding agents
- Learning a developer’s style, patterns, and preferences
- Adapting to code style and project conventions
Multiple supported frontends
- Integrations for other IDEs
- Vocode IDE
- Vocode Web
- Connect to Vocode on your machine with Vocode Mobile
Long term, we believe Vocode defines a new category:
- Not vibecoding (large, opaque generation)
- Not traditional editing (manual typing + autocomplete)
- But intent-driven programming
A world where:
- You think in logic
- You speak your intent
- And your code evolves instantly
Code at the speed of speech—without giving up control.
Built With
- anthropic
- elevenlabs
- go
- openai
- portaudio
- react
- typescript
- vscode-extension