Aegis

The Problem With Autonomous Agents

Every AI agent today has the same fundamental flaw — it has no trust boundary.

You give it access to your Gmail, your GitHub, your calendar, your documents. It can send emails as you. Delete files. Merge code. Make purchases. And it can do all of this without a single checkpoint between intent and execution.

The industry has framed this as a binary choice: fully autonomous or fully manual. Either you trust the agent completely or you don't use it at all.

That's the wrong framing. And Aegis is the answer.

The Insight

Trust is not binary. It never has been.

When you hand your keys to a valet, you trust them to park the car. You don't trust them to take it on a road trip. The level of authorization is proportional to the consequence of the action.

Aegis brings this same logic to AI agents through a three-tier security model:

🟢 GREEN — Read-only, navigational actions. Execute silently.
🟡 YELLOW — Actions that modify state. Require verbal confirmation.
🔴 RED — Irreversible, destructive, or sensitive actions. Require biometric authentication via Touch ID or Face ID.

Every action the agent takes is classified in real time by an LLM classifier that evaluates intent and consequence, not just tool names. Sending an email is RED. Drafting one is YELLOW. Reading your inbox is GREEN.
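
The tier definitions above can be read as a mapping from an action's consequences to a tier. The following is a minimal sketch of that mapping, not Aegis's actual classifier: it assumes the classifier reports three boolean consequence flags, and `Consequence` and `tier_for` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    GREEN = "green"    # read-only: execute silently
    YELLOW = "yellow"  # modifies state: verbal confirmation
    RED = "red"        # irreversible, destructive, or sensitive: biometric auth


@dataclass(frozen=True)
class Consequence:
    # Hypothetical flags a classifier might report for one proposed action.
    modifies_state: bool = False
    irreversible: bool = False
    sensitive: bool = False


def tier_for(c: Consequence) -> Tier:
    """Map consequences to a tier; the most severe flag wins."""
    if c.irreversible or c.sensitive:
        return Tier.RED
    if c.modifies_state:
        return Tier.YELLOW
    return Tier.GREEN


# The three email examples from the text, with flags assigned by hand.
read_inbox = Consequence()
draft_email = Consequence(modifies_state=True)
send_email = Consequence(modifies_state=True, irreversible=True)

print(tier_for(read_inbox), tier_for(draft_email), tier_for(send_email))
```

Deriving the tier from consequences rather than from a tool name is what lets reading, drafting, and sending land in three different tiers even though all three touch the same mailbox.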

How It Works

Aegis is a voice-controlled AI agent that runs locally on your Mac using the Gemini Live API. You speak naturally. Aegis listens, plans, and acts — but every action passes through the security gate before execution.

The gate routes actions by tier:

GREEN actions execute immediately with no interruption
YELLOW actions pause and narrate — Aegis tells you exactly what it is about to do and waits for your verbal yes
RED actions halt completely — your iPhone companion app receives an auth request, Face ID fires, and only a successful biometric authentication allows execution to proceed
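
The routing above can be sketched as a single function that runs an action only after the check its tier requires has passed. This is an illustrative sketch under assumptions: `run_through_gate` and the two confirmation callbacks are hypothetical names, standing in for the voice prompt and the Face ID round trip.

```python
from enum import Enum
from typing import Callable


class Tier(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def run_through_gate(
    tier: Tier,
    description: str,
    execute: Callable[[], str],
    confirm_by_voice: Callable[[str], bool],
    confirm_by_biometric: Callable[[str], bool],
) -> str:
    """Run `execute` only if the check required by `tier` passes."""
    if tier is Tier.GREEN:
        approved = True
    elif tier is Tier.YELLOW:
        approved = confirm_by_voice(description)
    else:
        # RED, or anything unrecognised: apply the strictest check.
        approved = confirm_by_biometric(description)
    return execute() if approved else "blocked"


# Stubbed callbacks simulate a failed biometric check on a RED action.
result = run_through_gate(
    Tier.RED,
    "Send email to the team",
    execute=lambda: "sent",
    confirm_by_voice=lambda d: True,
    confirm_by_biometric=lambda d: False,
)
print(result)  # blocked
```

Passing the confirmation steps in as callbacks keeps the gate itself free of any voice or WebAuthn code, so each path can be exercised in isolation.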

The entire audit trail streams in real time to a web dashboard. Every action logged. Every auth request recorded. Full visibility into everything your agent did — and everything it was stopped from doing.
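
For readers unfamiliar with Server-Sent Events, here is a small sketch of how one audit record could be serialised as an SSE frame. The record fields, the `audit` event name, and the `gmail.send` action label are assumptions made for illustration, not the project's actual schema.

```python
import json
from datetime import datetime, timezone


def audit_record(action: str, tier: str, outcome: str) -> dict:
    """Build one audit entry with a UTC timestamp (hypothetical fields)."""
    return {
        "action": action,
        "tier": tier,
        "outcome": outcome,
        "at": datetime.now(timezone.utc).isoformat(),
    }


def sse_frame(event: str, payload: dict, event_id: int) -> str:
    """Serialise a payload as one Server-Sent Events frame."""
    # json.dumps emits a single line, so one `data:` field is enough.
    data = json.dumps(payload, separators=(",", ":"))
    # Each field ends with a newline; a blank line terminates the event.
    return f"id: {event_id}\nevent: {event}\ndata: {data}\n\n"


frame = sse_frame("audit", audit_record("gmail.send", "red", "blocked"), 1)
print(frame, end="")
```

In a FastAPI backend, strings like this could be yielded from a generator wrapped in a `StreamingResponse` with media type `text/event-stream`; whether Aegis does exactly that is not stated in the write-up.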

What I Built

Aegis Agent — Voice-controlled macOS agent powered by Gemini Live API and Composio tools (Gmail, Calendar, Docs, Sheets, Slides, Tasks, GitHub)
Three-Tier Classifier — LLM-based intent classifier that evaluates every action before execution
Security Gate — Routes actions to silent execution, voice confirmation, or biometric hold
Mobile Companion App — iPhone PWA that receives RED action auth requests and gates them behind Face ID via WebAuthn
Real-Time Dashboard — Web dashboard with SSE streaming showing live audit trail across all devices
GCP Backend — FastAPI on Cloud Run with Firestore for auth state, audit logging, and user management
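
To give a feel for the WebAuthn side of the companion app, the sketch below shows two server-side pieces of an assertion flow: issuing a random challenge and checking the `type`, `challenge`, and `origin` fields of the `clientDataJSON` the browser returns. It is deliberately partial. Signature verification, the user-verification flag, and counter checks are omitted and are best left to a maintained WebAuthn library. The function names and the `https://aegis.example` origin are hypothetical.

```python
import base64
import hmac
import json
import secrets


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as WebAuthn challenges are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def new_challenge() -> str:
    """Issue a random, single-use challenge for one auth request."""
    return b64url(secrets.token_bytes(32))


def client_data_ok(client_data_json: bytes, expected_challenge: str,
                   expected_origin: str) -> bool:
    """Check the type, challenge, and origin fields of clientDataJSON."""
    try:
        data = json.loads(client_data_json)
    except ValueError:
        return False
    if not isinstance(data, dict) or not isinstance(data.get("challenge"), str):
        return False
    return (
        data.get("type") == "webauthn.get"
        and hmac.compare_digest(data["challenge"].encode(),
                                expected_challenge.encode())
        and data.get("origin") == expected_origin
    )


challenge = new_challenge()
sample = json.dumps({
    "type": "webauthn.get",
    "challenge": challenge,
    "origin": "https://aegis.example",  # hypothetical origin
}).encode()
print(client_data_ok(sample, challenge, "https://aegis.example"))  # True
```

Binding each RED request to a fresh challenge is what prevents an old approval from being replayed against a new action.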

The Stack

Gemini Live API for real-time voice. Gemini 2.5 Flash for classification and planning. WebAuthn for biometric auth. FastAPI on GCP Cloud Run. Firestore for state and audit logs. React PWAs for Mac, mobile, and dashboard.

Challenges

The trust problem is harder than it looks. Classification based on tool names doesn't work: the same tool can be GREEN or RED depending on context. Rewriting the classifier around intent and consequence rather than tool identity was the key architectural insight.
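
One way to picture intent-based classification is that the classifier sees the tool name, its arguments, and the user's request together, and that any reply it cannot parse is treated as RED. The sketch below illustrates that idea only; the payload shape, the `github.update_pull_request` tool name, and the fail-closed default are assumptions, not details taken from Aegis.

```python
import json

VALID_TIERS = {"GREEN", "YELLOW", "RED"}


def classifier_input(tool: str, arguments: dict, user_request: str) -> str:
    """Bundle tool, arguments, and the spoken request into one JSON payload."""
    return json.dumps(
        {"tool": tool, "arguments": arguments, "user_request": user_request},
        sort_keys=True,
    )


def parse_tier(model_reply: str) -> str:
    """Read the tier from a JSON reply, defaulting to RED when in doubt."""
    try:
        tier = str(json.loads(model_reply)["tier"]).upper()
    except (ValueError, KeyError, TypeError):
        return "RED"  # unparseable reply: fail closed (an assumption)
    return tier if tier in VALID_TIERS else "RED"


# Same hypothetical tool, different arguments: the payloads differ, so a
# classifier that reads them can reach different tiers for each call.
rename = classifier_input("github.update_pull_request",
                          {"title": "Fix typo"}, "rename my PR")
close = classifier_input("github.update_pull_request",
                         {"state": "closed"}, "close my PR")
print(rename != close)
print(parse_tier('{"tier": "yellow"}'), parse_tier("not json"))
```

A lookup table keyed on the tool name alone would have to assign both calls the same tier, which is the failure mode described above.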

Biometric auth across devices is latency-sensitive. The RED action flow involves the local agent, the GCP backend, and the iPhone app all coordinating in real time. Getting this to feel seamless required careful SSE streaming and WebAuthn implementation.
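
The coordination described above can be modelled as a small state machine for each auth request: the agent creates it, the phone resolves it, and it expires if nobody answers. The sketch below uses an in-memory dictionary as a stand-in for Firestore; `AuthStore`, the status names, and the 60-second lifetime are assumptions made for illustration.

```python
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class AuthRequest:
    description: str
    expires_at: float
    status: str = "pending"  # pending -> approved | denied | expired
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class AuthStore:
    """In-memory stand-in for the backend's auth state."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._requests = {}

    def create(self, description: str) -> str:
        """Called by the agent when a RED action is held."""
        req = AuthRequest(description, self._clock() + self._ttl)
        self._requests[req.request_id] = req
        return req.request_id

    def resolve(self, request_id: str, approved: bool) -> bool:
        """Record the phone's answer; ignored unless still pending."""
        if self.status(request_id) != "pending":
            return False
        self._requests[request_id].status = "approved" if approved else "denied"
        return True

    def status(self, request_id: str) -> str:
        """Called by the agent to learn whether it may proceed."""
        req = self._requests.get(request_id)
        if req is None:
            return "unknown"
        if req.status == "pending" and self._clock() >= req.expires_at:
            req.status = "expired"
        return req.status
```

Refusing to resolve a request that is no longer pending means a late or duplicated approval from the phone cannot unlock an action that has already timed out or been denied.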

Solo build. Every line of code, every deployment, and every design decision came from one person in one hackathon window.

What I Learned

The hardest problem in agentic AI isn't capability. It's trust. Users will delegate complex tasks to AI agents the moment they believe the agent cannot act beyond its authorization. Aegis is proof that graduated trust is not only possible; it's the only architecture that makes autonomous agents safe to deploy in the real world.

What's Next

Aegis is the security primitive that every AI agent needs. The next step is making it model-agnostic and platform-agnostic — a trust layer that any agent can plug into, regardless of the underlying model or tool suite.

Built With

docker
fastapi
firestore
flash
gemini-live
github-actions
google-live
pyobjc
python
react
webauthn

Updates

Harshit Singh Bhandari started this project — Mar 16, 2026 03:35 PM EDT
