Inspiration

We've all been on calls where we needed information fast. A calendar check, a fact lookup, a suggestion on what to say. But we couldn't break away to search for it. Existing assistants require you to leave the conversation, open an app, and context-switch back. By then the moment has passed.

We wanted an assistant that lives inside the phone call itself. Not something you switch to. Something that's just there, listening, waiting for its name.

What it does

Mijo is a voice copilot that joins a live phone call as a third participant. You dial Mijo's number, merge the call, and he listens. He only activates when you say his name.

  • Whisper assist. "Mijo, what should I say?" gives you a coached response based on the conversation so far.
  • Calendar and email. Pulls live data from your Google or Microsoft account.
  • Web search. Real-time answers powered by Exa.
  • Notes and summaries. Captures notes mid-call and recaps everything on demand.
  • Stealth mode. "Mijo, go stealth" routes all responses as SMS instead of speaking aloud. The other party hears nothing.

How we built it

Three layers:

  • Twilio handles the phone number, call routing, media streaming, and outbound SMS for stealth mode.
  • ElevenLabs Conversational AI is the voice brain. It handles speech recognition, turn-taking, LLM reasoning, and speech synthesis. The system prompt controls wake-word activation, response brevity, and personality.
  • Exa powers real-time web search when the agent needs to look something up.

We built a Next.js web app for onboarding and connecting Google and Microsoft accounts, backed by Supabase for user profiles, call logs, and OAuth token storage. The telephony service runs on Fastify, handling Twilio webhooks and media stream connections to ElevenLabs.
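For readers unfamiliar with Twilio media streams, here is a minimal sketch of the kind of TwiML a voice webhook can return to open a bidirectional stream to a WebSocket endpoint. The helper names, the example URL, and the `userId` parameter are hypothetical and not taken from the Mijo codebase; only the `<Connect>`, `<Stream>`, and `<Parameter>` elements are Twilio's own.

```typescript
// Hypothetical helper: escapes a value for use inside an XML attribute.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: builds the TwiML response body for a voice webhook.
// Custom values are passed as <Parameter> children because the Stream URL
// itself does not carry query-string parameters.
function buildStreamTwiml(
  streamUrl: string,
  params: Record<string, string> = {}
): string {
  const parameters = Object.keys(params)
    .map(
      (name) =>
        `<Parameter name="${escapeXml(name)}" value="${escapeXml(params[name])}" />`
    )
    .join("");
  return (
    `<?xml version="1.0" encoding="UTF-8"?>` +
    `<Response><Connect><Stream url="${escapeXml(streamUrl)}">` +
    parameters +
    `</Stream></Connect></Response>`
  );
}
```

In a Fastify route, a string like this would be sent back with a `text/xml` content type, and the WebSocket endpoint would then relay audio frames to and from the voice agent.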

Challenges we ran into

Getting wake-word activation reliable was the hardest part. "Mijo" can sound like "mi hijo" or "me though" in noisy call audio, so we had to heavily tune the system prompt to prefer silence over false activations. Stealth mode routing was also tricky. Deciding whether to speak or send an SMS based on the current mode created edge cases, especially when the user switched modes while a tool call was still in flight.
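The write-up does not say how the mode-switch edge case was resolved. One possible approach, sketched below under that caveat, is to read the mode when the result is delivered instead of when the tool call starts. `CallSession`, `Delivery`, and the method names are hypothetical and used only for illustration.

```typescript
// Hypothetical types and names; not taken from the Mijo codebase.
type Mode = "voice" | "stealth";

interface Delivery {
  channel: "speak" | "sms";
  text: string;
}

class CallSession {
  private mode: Mode = "voice";

  setMode(mode: Mode): void {
    this.mode = mode;
  }

  // Picks the channel from the mode at delivery time, so a switch made
  // while a tool call was in flight is still honored.
  deliver(text: string): Delivery {
    return { channel: this.mode === "stealth" ? "sms" : "speak", text };
  }

  // Awaits the tool, then routes its result. The mode is deliberately
  // not captured before the await.
  async respondWith(tool: () => Promise<string>): Promise<Delivery> {
    const text = await tool();
    return this.deliver(text);
  }
}
```

With this shape, a user who says "Mijo, go stealth" while a calendar lookup is still running gets the result by SMS, because `setMode("stealth")` runs before `deliver` reads the mode.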

Accomplishments that we're proud of

Stealth mode is the standout feature. You can say "Mijo, go stealth" and have every response arrive silently as a text message while the other party on the call has no idea. That felt like magic the first time it worked.

What we learned

ElevenLabs handles turn-taking and barge-in surprisingly well out of the box. The system prompt turned out to be the most powerful control surface. Small wording changes had outsized effects on response quality, brevity, and false activation rates. We also learned that on live calls, silence is a feature, not a bug.

What's next for Mijo

Multi-language support so Mijo can speak Spanish too. Proactive nudges that silently text you context before you even ask. And deeper integrations like CRM lookups, live transcription, and post-call action item tracking.
