Inspiration
We noticed something kind of absurd: AI assistants keep getting smarter, but they still just... talk. You ask them to send an email and they give you text to copy. You ask them to fix a GitHub issue and they explain what you should type. At some point that's just a very expensive autocomplete. The real frustration, though, wasn't ours; it was watching non-technical people use these tools. A designer on our team wanted to push a small change, update a ticket, and draft a message to a client. Three platforms, two tutorials, and forty minutes later, she was done. That felt wrong. That's where Astrophage started.
What it does
Astrophage is a desktop companion that actually does things. You tell it what you want in plain language or by voice and it executes: drafting and sending emails, handling GitHub issues, managing files on your system, and more. It lives in your taskbar and integrates with your tools through MCP (Model Context Protocol), so it's not locked into one workflow. You can spin up multiple agents, each with their own personality and skill set. One handles communication, one stays focused on code. There's a Control Centre dashboard where you track tasks, view chat history, and configure everything. MongoDB keeps memory persistent across sessions, so it actually remembers context from previous conversations, not just the current one. Voice input via ElevenLabs means you don't even have to type. And security is handled through ArmorIQ, which verifies tool calls before they execute.
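As a rough illustration, an agent definition could look something like the sketch below. Everything in it is hypothetical (the second agent's name, the personality strings, the tool identifiers); it shows the shape of the idea, not Astrophage's actual schema.

```js
// Hypothetical sketch of per-agent configuration.
// Field names and values are illustrative, not our actual schema.
const agents = [
  {
    name: "Buddy", // the communication agent
    personality: "warm, concise, writes like a helpful colleague",
    tools: ["gmail.draft", "gmail.send", "calendar.create"],
  },
  {
    name: "Patch", // hypothetical name for the code-focused agent
    personality: "terse, precise, code-first",
    tools: ["github.issues", "fs.read", "fs.write", "terminal.run"],
  },
];
```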
How we built it
The app is built on Electron, so it runs cross-platform from a single codebase. The AI brain is Gemini, which handles intent understanding and decides which tools to call. We built local MCP servers to handle system-level operations: file management, terminal sessions, and so on. External integrations (Gmail, GitHub, etc.) connect through the MCP protocol as well. MongoDB handles persistent memory across sessions. ElevenLabs handles voice transcription. ArmorIQ sits between the model and the tool calls to flag anything sketchy before it runs. The UI (the Control Centre, the chat interface, the character overlays on the taskbar) is all built in vanilla JS/CSS inside the Electron renderer.
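For a sense of what a "local MCP server" means in practice, here's a minimal sketch using the official MCP JavaScript SDK. The server name and the read_file tool are illustrative stand-ins, not our actual code.

```js
// Minimal local MCP server exposing one file tool.
// Server name and tool are illustrative; the SDK calls are real.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { readFile } from "node:fs/promises";

const server = new McpServer({ name: "astrophage-files", version: "0.1.0" });

// Register a read-only tool; the model decides when to call it.
server.tool(
  "read_file",
  { path: z.string().describe("Absolute path of the file to read") },
  async ({ path }) => ({
    content: [{ type: "text", text: await readFile(path, "utf8") }],
  })
);

// Local servers talk to the host app over stdio.
await server.connect(new StdioServerTransport());
```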
Challenges we ran into
Getting Electron to behave with MCP servers running in the background was messier than expected; the IPC between the main and renderer processes needed careful handling, especially with streaming model outputs. The security layer was genuinely hard to think through. When an agent can modify files or send emails on your behalf, "trust but verify" isn't good enough. Figuring out how ArmorIQ should intercept and validate tool calls without breaking the flow took a few design iterations. Persistent memory also caused some tricky edge cases: context from old sessions sometimes confused the agent, and we had to build cleanup logic to prevent stale memory from poisoning new conversations.
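For the curious, the streaming pattern we converged on looks roughly like the sketch below. The Electron APIs are real; the chat:* channel names and the streamChat() helper are stand-ins for illustration.

```js
// main process: forward model output to the renderer chunk by chunk.
// streamChat() stands in for the streaming Gemini call.
const { BrowserWindow, ipcMain } = require("electron");

ipcMain.handle("chat:send", async (event, prompt) => {
  const win = BrowserWindow.fromWebContents(event.sender);
  for await (const chunk of streamChat(prompt)) {
    win.webContents.send("chat:chunk", chunk.text);
  }
  win.webContents.send("chat:done");
});

// preload: expose a narrow API instead of the raw ipcRenderer.
const { contextBridge, ipcRenderer } = require("electron");
contextBridge.exposeInMainWorld("astrophage", {
  sendPrompt: (prompt) => ipcRenderer.invoke("chat:send", prompt),
  onChunk: (handler) => ipcRenderer.on("chat:chunk", (_e, text) => handler(text)),
  onDone: (handler) => ipcRenderer.on("chat:done", () => handler()),
});
```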
Accomplishments that we're proud of
The voice-to-action pipeline actually works end to end: speak your intent, watch it execute. That felt like a big deal when it finally clicked. The multi-agent setup, with Buddy handling communication and a separate agent handling code, both visible in the taskbar, is something we hadn't seen done this way before. It's a small UX detail, but it changes how it feels to use. We shipped a working build in 24 hours that could draft emails, manage GitHub issues, and modify local files from natural language. For a hackathon, that's the thing we're most proud of.
What we learned
MCP is genuinely good infrastructure for this kind of agent-tool connection. We came in skeptical about whether it would hold up under real use, and came out convinced. We also learned that UX is the hard part of agentic apps, not the AI. The model can figure out what to do. The challenge is making the user feel in control while it does things on their behalf — that's a design problem, not a model problem.
What's next for Astrophage - Udte Bhawre
A few things we want to tackle:
- More connectors. Jira, Linear, Figma, and Slack are already on the roadmap, and the MCP architecture makes them straightforward to add.
- A smarter memory system. Right now memory is per-agent; we want cross-agent shared context so Buddy can hand off to a specialist agent mid-task.
- A marketplace for agent personalities and skill configurations, so people can share setups that work well for their specific workflows.
- And honestly, just making onboarding easier. Right now you need a MongoDB URI and a Gemini API key before anything works. That's fine for a hackathon demo, but it's a wall for the non-technical users we're actually building for.