Inspiration
By commonly cited estimates, 80-95% of enterprise AI proofs of concept (PoCs) never reach production. While working on AgentGuard, a Python SDK for AI agent observability, we realized the governance problem is not just technical. Delivery managers, project leads, and non-technical stakeholders need to understand agent readiness too. We built AgentGuard Lite to make AI governance accessible to every team member, not just engineers.
What it does
AgentGuard Lite is a four-screen AI agent governance studio for enterprise teams:
Readiness Check — An 8-question assessment that generates a deterministic Production Readiness Scorecard across 5 dimensions: Observability, Cost Control, Governance, Testing, and Data Security. Each dimension is scored by rule-based logic (same input always gives same score), with AI-generated recommendations. Includes an Indian IT Context section flagging data residency compliance for BFSI and government clients.
Cost Calculator — Estimates daily and monthly costs for AI agent fleets across GPT-4o, Claude Sonnet, Llama 3.1, and Gemini Flash. Shows costs in both USD and INR. Triggers a budget warning when projected monthly cost exceeds $100.
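The arithmetic behind a calculator like this can be sketched as follows. This is a minimal illustration, not the generated app's code: the function name, the input fields, the per-token price, and the USD to INR rate are all hypothetical placeholders (not real vendor pricing or a live exchange rate). Only the $100 monthly warning threshold comes from the description above.

```javascript
// Threshold taken from the write-up; everything else below is an assumption.
const MONTHLY_BUDGET_WARNING_USD = 100;

// Estimate daily and monthly cost for a fleet of agents running one model.
// usdPerMillionTokens and usdToInr are caller-supplied placeholders.
function estimateFleetCost({
  agents,
  callsPerAgentPerDay,
  tokensPerCall,
  usdPerMillionTokens,
  usdToInr,
  daysPerMonth = 30,
}) {
  const dailyTokens = agents * callsPerAgentPerDay * tokensPerCall;
  const dailyUsd = (dailyTokens / 1_000_000) * usdPerMillionTokens;
  const monthlyUsd = dailyUsd * daysPerMonth;
  return {
    dailyUsd,
    monthlyUsd,
    dailyInr: dailyUsd * usdToInr,
    monthlyInr: monthlyUsd * usdToInr,
    // Warn only when the projection is strictly above the threshold.
    overBudget: monthlyUsd > MONTHLY_BUDGET_WARNING_USD,
  };
}

// Example with made-up numbers: 5 agents, 200 calls each per day,
// 2,000 tokens per call, $2 per million tokens, 80 INR per USD.
const estimate = estimateFleetCost({
  agents: 5,
  callsPerAgentPerDay: 200,
  tokensPerCall: 2000,
  usdPerMillionTokens: 2,
  usdToInr: 80,
});
console.log(estimate);
```

With these placeholder inputs the projection is $4 per day and $120 per month, so the budget warning would fire.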
Scenario Tester — Simulates agent behavior using MeDo's LLM skill, validates output against expected keywords, and checks estimated cost against a configured budget limit. Shows PASS/FAIL with keyword-level analysis.
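A rough sketch of the validation step, under stated assumptions: the agent output here is a canned string standing in for the LLM response, keyword matching is assumed to be a case-insensitive substring check, and a scenario is assumed to pass only when every keyword is found and the estimated cost is within the budget. The function and field names are hypothetical.

```javascript
// Check an agent's output against expected keywords and a budget limit.
// Matching rules (case-insensitive, all keywords required) are assumptions.
function evaluateScenario({ output, expectedKeywords, estimatedCostUsd, budgetLimitUsd }) {
  const haystack = output.toLowerCase();
  // Keep a per-keyword result so the UI can show keyword-level analysis.
  const keywordResults = expectedKeywords.map((keyword) => ({
    keyword,
    found: haystack.includes(keyword.toLowerCase()),
  }));
  const allKeywordsFound = keywordResults.every((result) => result.found);
  const withinBudget = estimatedCostUsd <= budgetLimitUsd;
  return {
    status: allKeywordsFound && withinBudget ? "PASS" : "FAIL",
    keywordResults,
    withinBudget,
  };
}

// Canned output used in place of a real LLM call.
const sampleOutput = "Your refund has been approved and will arrive in 5 days.";

const passing = evaluateScenario({
  output: sampleOutput,
  expectedKeywords: ["refund", "approved"],
  estimatedCostUsd: 0.02,
  budgetLimitUsd: 0.05,
});

const failing = evaluateScenario({
  output: sampleOutput,
  expectedKeywords: ["refund", "escalate"],
  estimatedCostUsd: 0.02,
  budgetLimitUsd: 0.05,
});

console.log(passing.status, failing.status);
```

Returning the per-keyword results alongside the overall status is what lets a failed run show which keyword was missing rather than just a red badge.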
History — Saves every assessment to localStorage. Teams can track governance score improvements over time. Shows total assessments and average score across all runs.
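The persistence and summary logic could look roughly like the sketch below. The storage key, the record shape (an `overallScore` field), and the helper names are hypothetical. The helpers accept any object with the `getItem`/`setItem` shape of the Web Storage API, so the browser's `localStorage` can be passed in directly; the in-memory stand-in exists only so the sketch also runs outside a browser.

```javascript
const HISTORY_KEY = "agentguard-lite-history"; // hypothetical key name

// In-memory stand-in with the same getItem/setItem shape as localStorage.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => {
      data.set(key, String(value));
    },
  };
}

// Read the saved assessments, or an empty list if nothing is stored yet.
function loadHistory(storage) {
  const raw = storage.getItem(HISTORY_KEY);
  return raw === null ? [] : JSON.parse(raw);
}

// Append one assessment and write the whole list back as JSON.
function saveAssessment(storage, assessment) {
  const history = loadHistory(storage);
  history.push(assessment);
  storage.setItem(HISTORY_KEY, JSON.stringify(history));
  return history;
}

// Total number of runs and the average overall score across them.
function summarizeHistory(history) {
  const total = history.length;
  const sum = history.reduce((acc, item) => acc + item.overallScore, 0);
  return { total, average: total === 0 ? 0 : sum / total };
}

// In the browser, pass window.localStorage instead of the stand-in.
const storage = createMemoryStorage();
saveAssessment(storage, { date: "2025-01-10", overallScore: 60 });
saveAssessment(storage, { date: "2025-02-10", overallScore: 80 });
const summary = summarizeHistory(loadHistory(storage));
console.log(summary);
```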
How we built it
Built entirely with MeDo's Deep Build mode in one session:
- Submitted a single detailed prompt describing the three initial screens (Readiness Check, Cost Calculator, Scenario Tester), navigation, LLM integration, scoring logic, and design requirements
- MeDo generated a requirements document for confirmation before building
- MeDo generated the complete full-stack application including navigation, form logic, LLM skill integration, calculation engine, chart rendering, and responsive design
- MeDo detected and automatically fixed a streaming API error in the LLM integration layer
- Follow-up prompts added: deterministic scoring rules, Indian IT Context section, and History tab with localStorage persistence
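The most impressive MeDo capability we used was the LLM skill integration: it takes the 8 free-form answers and returns structured, dimension-specific recommendations in real time. In the first build it also produced the dimension scores, which we later replaced with rule-based logic.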
Challenges we ran into
- MeDo initially used LLM-generated scores, which were non-deterministic. We solved this by adding rule-based scoring logic through a follow-up prompt while keeping the LLM for recommendation text only.
- The Governance rule needed case-insensitive matching so that "Nobody" and "nobody" score the same. Fixed with a targeted follow-up prompt.
- Balancing feature depth against the limited credits available.
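The deterministic scoring and the case-insensitive fix described above can be illustrated with a small sketch. The actual rules generated in the app are not reproduced here: the answer fields (`agentOwner`, `hasBudgetLimit`), the two dimensions shown, and the point values are hypothetical. The point is only that each dimension is a pure function of the answers, and that answers are normalized before comparison.

```javascript
// Hypothetical rules: field names and point values are illustrative only.
const RULES = {
  // Normalize before comparing so "Nobody", "nobody", and " NOBODY " match.
  governance: (answers) => {
    const owner = String(answers.agentOwner ?? "").trim().toLowerCase();
    return owner === "" || owner === "nobody" ? 0 : 100;
  },
  costControl: (answers) => (answers.hasBudgetLimit ? 100 : 0),
};

// Pure function: no randomness and no LLM call, so the same answers
// always produce the same scores.
function scoreAssessment(answers) {
  const scores = {};
  for (const [dimension, rule] of Object.entries(RULES)) {
    scores[dimension] = rule(answers);
  }
  return scores;
}

const upperCase = scoreAssessment({ agentOwner: "Nobody", hasBudgetLimit: true });
const lowerCase = scoreAssessment({ agentOwner: "nobody", hasBudgetLimit: true });
const owned = scoreAssessment({ agentOwner: "Delivery manager", hasBudgetLimit: false });
console.log(upperCase, lowerCase, owned);
```

Keeping the rules in a plain lookup table like this also leaves the LLM free to generate only the recommendation text, since nothing in the score depends on its output.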
Accomplishments that we're proud of
- Deterministic scoring that gives consistent, trustworthy results rather than scores that vary from one LLM run to the next
- Indian IT Context section that directly addresses data residency requirements for BFSI and government clients in India
- History tracking that turns a one-time check into a governance practice
- The entire app was built, debugged, and extended through conversation with MeDo — zero manual coding
What we learned
MeDo's Deep Build mode is genuinely capable of generating production-grade application logic, not just UI scaffolding. The key is giving it a precise, structured prompt with explicit business rules. Vague prompts produce vague apps. Specific prompts produce specific apps.
Multi-turn iteration is MeDo's most powerful feature — being able to extend and fix a generated app through follow-up conversation without rebuilding from scratch.
What's next for AgentGuard Lite
- Export scorecard as PDF for client reporting
- Team sharing — save assessments to a shared workspace, not just localStorage
- Connect to the AgentGuard Python SDK for real agent trace data instead of simulated responses
- Benchmark database — compare your agent's score against industry averages for your sector
Built With
- javascript
- llm
- medo
- supabase