VoiceDesk

Developer Week Hackathon 2026

Inspiration

During a late-night work session, I found myself switching between Trello, Google Calendar, Notion, and Slack, doing the same repetitive thing in each one: typing. In 2026, it felt wrong that the fastest way to get something done across my tools was still manual copy-pasting and text entry. I envisioned a single interface where I could speak naturally and have my words turn into immediate actions. That quiet frustration became the foundation for VoiceDesk.

What it does

VoiceDesk is a real-time voice operator that turns natural speech into immediate actions across Google Calendar, Trello, Notion, and Slack. By eliminating menus, manual typing, and context switching, it lets you manage your workflow hands-free.

You simply say:

“Schedule a demo with the team next Tuesday at 2 pm” → Google Calendar event created.

“Add a task to prepare the slides” → Trello card created.

“Take a note on the new pricing tier” → Notion page saved.

“Tell the team I’ll be 10 minutes late” → Slack message sent.

VoiceDesk also acts as an information hub, answering queries like “What’s on my calendar today?” or “Show me my pending tasks” using real-time data retrieval.

How we built it

VoiceDesk was built as a solo project using Deepgram’s Voice Agent API as the core intelligence engine.

Real-time bidirectional streaming: Managed via Socket.IO to ensure audio and data flow seamlessly.

Intelligent execution: Intent recognition and function calling are executed server-side to maintain speed and security.

Frontend design: A React + Tailwind interface featuring five deliberately distinct page designs to provide a professional, dashboard-like experience.

Zero-setup UX: A full offline/demo mode was implemented so judges and users can experience the functionality instantly without needing personal API configurations.
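To illustrate the server-side function calling described above, here is a minimal sketch of a dispatcher that routes a function-call request to a handler. It is not VoiceDesk's actual code: the handler names, the argument fields, and the `{"name": ..., "arguments": ...}` request shape are all assumptions made for the example, and the handlers return canned results instead of calling Google Calendar or Trello.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry mapping a function name to the handler that runs it.
HANDLERS: Dict[str, Callable[..., Dict[str, Any]]] = {}


def register(name: str):
    """Decorator that adds a handler to the registry under `name`."""
    def wrap(fn: Callable[..., Dict[str, Any]]):
        HANDLERS[name] = fn
        return fn
    return wrap


@register("create_calendar_event")
def create_calendar_event(title: str, start: str) -> Dict[str, Any]:
    # A real handler would call the calendar API here (assumption).
    return {"status": "created", "service": "calendar", "title": title, "start": start}


@register("create_trello_card")
def create_trello_card(name: str) -> Dict[str, Any]:
    # A real handler would call the Trello API here (assumption).
    return {"status": "created", "service": "trello", "name": name}


def dispatch(request_json: str) -> Dict[str, Any]:
    """Route one function-call request (assumed JSON shape) to its handler."""
    request = json.loads(request_json)
    name = request.get("name")
    handler = HANDLERS.get(name)
    if handler is None:
        return {"status": "error", "reason": f"unknown function: {name}"}
    try:
        return handler(**request.get("arguments", {}))
    except TypeError as exc:
        # Missing or unexpected arguments surface as a TypeError on the call.
        return {"status": "error", "reason": str(exc)}
```

Keeping the registry on the server means API credentials never reach the browser, and an unknown or malformed request produces an error result rather than an exception.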

Challenges we ran into

The primary hurdle was achieving sub-second perceived latency while maintaining 99%+ intent accuracy across casual, unscripted speech. The final architecture uses Deepgram’s live transcription with finely tuned utterance detection and streams audio in small chunks. This brought the end-to-end response time down to approximately 750 ms, which makes the conversation feel natural.
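As a rough illustration of what streaming audio in chunks involves, the sketch below splits a raw PCM buffer into fixed-duration frames. The format (16 kHz, 16-bit mono) and the 20 ms chunk length are assumptions chosen for the arithmetic, not values taken from VoiceDesk. Shorter chunks mean audio spends less time waiting in a buffer before it is sent, which is one of several contributors to perceived latency.

```python
from typing import Iterator

# Assumed audio format for the example: 16 kHz, 16-bit (2-byte) mono PCM.
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2
CHUNK_MS = 20  # assumed chunk duration


def chunk_size_bytes(sample_rate_hz: int = SAMPLE_RATE_HZ,
                     bytes_per_sample: int = BYTES_PER_SAMPLE,
                     chunk_ms: int = CHUNK_MS) -> int:
    """Bytes needed to hold `chunk_ms` milliseconds of mono PCM audio."""
    return sample_rate_hz * bytes_per_sample * chunk_ms // 1000


def iter_chunks(pcm: bytes, size: int) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`, each at most `size` bytes long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for offset in range(0, len(pcm), size):
        yield pcm[offset:offset + size]
```

With these assumed values a chunk is 16,000 × 2 × 0.020 = 640 bytes, so one second of audio becomes 50 chunks; each chunk would then be handed to whatever transport sends it to the server.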

Accomplishments we're proud of

Zero-config demo experience: Designed to work out of the box for anyone, anywhere.

Visual versatility: Five unique, production-quality screens that go beyond a simple chat interface.

Robust NLU: Natural language understanding that handles varied phrasing without requiring rigid templates.

Reliable fallbacks: Seamless transition to mock mode when external APIs are not configured, ensuring the app never "breaks" during a demo.

Mobile-first responsive design: Ensuring productivity isn't tethered to a desktop.
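The mock-mode fallback can be pictured as a small factory that returns a live client only when credentials are present. This is a sketch under assumptions, not the project's implementation: the class names and the `TRELLO_API_KEY` / `TRELLO_TOKEN` variable names are hypothetical, and the live client is left as a stub.

```python
from typing import Any, Dict, List, Mapping, Union


class MockTrello:
    """In-memory stand-in used when no credentials are configured."""

    def __init__(self) -> None:
        self.cards: List[Dict[str, Any]] = []

    def create_card(self, name: str) -> Dict[str, Any]:
        card = {"id": f"mock-{len(self.cards) + 1}", "name": name, "mock": True}
        self.cards.append(card)
        return card


class LiveTrello:
    """Placeholder for a client that would talk to the real service."""

    def __init__(self, api_key: str, token: str) -> None:
        self.api_key = api_key
        self.token = token

    def create_card(self, name: str) -> Dict[str, Any]:
        # Stub: the real HTTP request is intentionally not shown here.
        raise NotImplementedError("call the real API here")


def make_trello(env: Mapping[str, str]) -> Union[LiveTrello, MockTrello]:
    """Return a live client if both (hypothetical) settings exist, else a mock."""
    key = env.get("TRELLO_API_KEY")
    token = env.get("TRELLO_TOKEN")
    if key and token:
        return LiveTrello(key, token)
    return MockTrello()
```

Passing the environment in as a mapping (for example `os.environ` in a real process) keeps the choice testable, and because both classes expose the same `create_card` method the rest of the application does not need to know which mode it is running in.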

What we learned

Voice is the ultimate interface when latency disappears and actions become real. Deepgram’s Voice Agent API removes two barriers that have held voice applications back for years: unreliability and delayed execution. I learned that for voice to be adopted, it can't just "talk"; it has to "do."

What's next for VoiceDesk

Browser Extension: Highlight any text on any site and simply say “VoiceDesk this” to process it.

Mobile Companion App: Bringing the power of the voice operator to users on the go.

Team Context & Memory: Allowing the agent to remember project details and team preferences for even faster interactions.

Enterprise Deployment: Implementing shared voice operators for collaborative environments.

VoiceDesk is not just another voice demo; it is a practical replacement for typing in the modern workplace.

Thank you, Developer Week 2026, for the opportunity to show what’s possible when voice finally works.

Updates

I’m currently working on integrating the Deepgram API. I’ve initialized the setup and am now exploring how its different features fit into the overall workflow. Since Deepgram offers multiple capabilities (real-time streaming, transcription options, etc.), it’ll take a couple of days to fully understand the best approach and adapt the code structure accordingly. Once this is clear, I’ll move forward with deeper integration and optimization.
