Inspiration

AURA was inspired by the gap between thought and action during focused work. We wanted an always-available assistant that can capture your voice, understand your context (clipboard + screenshot), and immediately execute the right tool flow without forcing you to switch apps or break concentration.

What it does

AURA is an overlay-based AI copilot that listens to voice or text commands, plans a tool graph, and executes it in real time. It can transcribe speech, read clipboard text/images, optionally capture screenshots, route requests through backend tools, stream results live, and return a final actionable output.
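
To make that flow concrete, here is a minimal TypeScript sketch of the planning exchange. The `/router/plan` endpoint is from our stack, but the `PlanRequest`/`PlanResponse` shapes, field names, and localhost URL are illustrative assumptions rather than AURA's actual schema.

```typescript
// Hypothetical shapes for the /router/plan exchange; the endpoint path is
// real, but every field name below is an illustrative assumption.
interface PlanRequest {
  transcript: string;      // ASR output or typed command
  clipboardText?: string;  // optional clipboard context
  screenshotId?: string;   // reference to an uploaded screenshot, if any
}

interface PlannedTool {
  name: string;                   // tool identifier in the graph
  args: Record<string, unknown>;  // tool-specific arguments
}

interface PlanResponse {
  intent: string;        // what the planner thinks the user wants
  tools: PlannedTool[];  // ordered tool graph to execute
}

// Ask the backend to plan a tool graph for the captured context.
async function planToolGraph(req: PlanRequest): Promise<PlanResponse> {
  const res = await fetch("http://localhost:8000/router/plan", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`planning failed: HTTP ${res.status}`);
  return (await res.json()) as PlanResponse;
}
```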

How we built it

We built AURA with an Electron + React + Vite frontend overlay and a FastAPI backend.

  • Frontend handles the floating UI, audio recording, and local integrations (clipboard/screenshot).
  • Backend handles ASR, planning (/router/plan), and graph execution (/execute/graph) with streamed events.
  • Data/context layers use MongoDB for operational state and ChromaDB for retrieval/memory use cases.
  • We connected everything through typed payloads and event-stream parsing to keep responses fast and interactive (a stream-parsing sketch follows this list).
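
Below is a minimal sketch of how the frontend can consume streamed events from `/execute/graph`. It assumes newline-delimited JSON events and a caller-supplied `handleEvent` callback; treat the shapes as illustrative rather than our exact protocol.

```typescript
// A minimal sketch of consuming streamed events from /execute/graph.
// Assumes newline-delimited JSON events; the request body and event
// shape are illustrative, not our exact protocol.
async function streamExecution(
  planId: string,
  handleEvent: (event: Record<string, unknown>) => void,
): Promise<void> {
  const res = await fetch("http://localhost:8000/execute/graph", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ planId }),
  });
  if (!res.body) throw new Error("no response stream");

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // A chunk can end mid-event, so parse only complete lines and keep
    // the remainder buffered for the next read.
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      if (line.trim()) handleEvent(JSON.parse(line));
    }
  }
  // Flush a trailing event that had no final newline.
  if (buffer.trim()) handleEvent(JSON.parse(buffer));
}
```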

Challenges we ran into

  • Synchronizing multiple real-time flows (recording, upload, planning, execution, streaming) without race conditions.
  • Handling unreliable edge cases gracefully (failed screenshot upload, partial stream chunks, empty audio).
  • Designing a compact overlay UX that still surfaces rich execution detail (intent, tools, stream, final output).
  • Keeping Electron boundaries secure while still exposing useful desktop capabilities (a preload sketch follows this list).
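
As a sketch of how that boundary can stay narrow, here is the preload-script pattern we lean on, with `contextIsolation` enabled. The `aura` bridge name and the IPC channel names are hypothetical, not our actual identifiers.

```typescript
// preload.ts: a minimal sketch of exposing a narrow desktop API to the
// overlay with contextIsolation enabled. The "aura" bridge name and the
// IPC channel names are illustrative, not our actual identifiers.
import { contextBridge, ipcRenderer } from "electron";

contextBridge.exposeInMainWorld("aura", {
  // The renderer never touches Electron modules directly; each call is
  // forwarded to a vetted handler in the main process.
  readClipboard: (): Promise<string> =>
    ipcRenderer.invoke("aura:read-clipboard"),
  captureScreenshot: (): Promise<string> =>
    ipcRenderer.invoke("aura:capture-screenshot"),
});

// main.ts (excerpt): the privileged side registers matching handlers.
// import { ipcMain, clipboard } from "electron";
// ipcMain.handle("aura:read-clipboard", () => clipboard.readText());
```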

Accomplishments that we're proud of

  • End-to-end voice-to-execution workflow working inside a lightweight overlay.
  • Context-aware planning that combines voice, clipboard, and screen signals.
  • Live streamed execution feedback with clear final outputs.
  • Improved plan transparency by showing multiple planned tools as a readable list in the UI (a rendering sketch follows this list).
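
As a rough illustration of that list view, here is a minimal React sketch; the `PlannedTool` shape and the `ToolPlanList` component are invented for illustration and are not our actual components.

```tsx
import React from "react";

// Illustrative shape for one planned tool step (not our actual type).
interface PlannedTool {
  name: string;
  description?: string;
}

// Renders the planner's tool graph as a readable ordered list so users
// can see what will run before and during execution.
export function ToolPlanList({ tools }: { tools: PlannedTool[] }) {
  if (tools.length === 0) {
    return <p>No tools planned yet.</p>;
  }
  return (
    <ol>
      {tools.map((tool, i) => (
        <li key={`${tool.name}-${i}`}>
          <strong>{tool.name}</strong>
          {tool.description ? `: ${tool.description}` : null}
        </li>
      ))}
    </ol>
  );
}
```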

What we learned

  • Users trust AI systems more when execution plans are visible before/during action.
  • Multimodal context dramatically improves intent understanding, but demands strict payload and error discipline.
  • Small UX details (panel states, keyboard shortcuts, status indicators) have outsized impact on perceived speed and quality.
  • Building reliable AI products is as much about orchestration and resilience as model quality.

What's next for AURA

  • Add richer tool-step visualization (status, duration, success/failure per node).
  • Introduce editable/confirmable plans before execution for high-impact actions.
  • Improve memory and personalization so AURA adapts to user workflows over time.
  • Expand integrations (calendar, docs, messaging, dev tools) and add stronger test coverage for streaming + ASR paths.

Built With

chromadb, electron, fastapi, mongodb, react, vite
