Jupiter AI — Trust Your AI Before It Reaches Production
💡 Inspiration
Engineers solved LLM indeterminism. The actual blocker is governance.
PERCEIVED BARRIER REAL BARRIER
────────────────── ──────────────────────────────
Hallucinations → Data leaks & policy violations
Formatting errors → No data lineage or access limits
Response variability → Audits are manual & retrospective
→ Zero real-time enforcement
AI agents now have read/write access to everything:
┌──────────────────┐
│ AI Agent / RAG │
└────────┬─────────┘
│
┌──────────────────┼──────────────────┐
▼ ▼ ▼
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ Customer PII│ │ Financial │ │ Internal │
│ & PHI │ │ Records & │ │ Policies & │
│ │ │ Payments │ │ IP / Docs │
└─────────────┘ └──────────────┘ └─────────────┘
▼ ▼ ▼
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ CRM / ERP │ │ Gov Data │ │ Proprietary│
│ Platforms │ │ Sources │ │ Documents │
└─────────────┘ └──────────────┘ └─────────────┘
Yet governance audits remain manual, slow, and reactive — teams inspect < 1% of agent traces, and only after a breach.
⚙️ What It Does
Jupiter autonomously discovers, models, tests, and audits enterprise AI systems.
CONNECT AI SYSTEM
│
▼
┌─────────────────────┐
│ 1. DISCOVERY │ Conversational agent interviews stakeholders
│ Purpose · Models │ Extracts: architecture, tools, APIs, data
│ Tools · APIs · Data│
└────────┬────────────┘
│
▼
┌─────────────────────┐
│ 2. KNOWLEDGE GRAPH │ Maps system entities → governance ontology
│ Risks · Controls │ CRM Tool → CRM Type · Email → PII Asset
│ Regs · Policies │
└────────┬────────────┘
│
▼
┌─────────────────────┐
│ 3. DIGITAL TWIN │ Governance-aware model of the AI system
│ Tools · Assets │ Covers: data flows, controls, dependencies
│ Flows · Risks │
└────────┬────────────┘
│
▼
┌─────────────────────┐
│ 4. POLICY PARSING │ Converts policies → machine-testable rules
│ Policy → Control │
│ → Asset → Risk │ "No customer data disclosure"
└────────┬────────────┘ → Control · Protected Asset · Risk
│
▼
┌─────────────────────┐
│ 5. TEST GENERATION │ Every test traced to a policy
│ Policy-driven, │ Policy → Control → Risk → Objective → Prompt
│ not generic │
└────────┬────────────┘
│
▼
┌─────────────────────┐
│ 6. TEST EXECUTION │ Systematic, repeated adversarial testing
│ PII Leakage │
│ Prompt Injection │
│ Tool Abuse │
│ Data Exfiltration │
└────────┬────────────┘
│
▼
┌─────────────────────┐
│ 7. OBSERVABILITY │ Integrates with platforms (e.g. Arize Phoenix)
│ Tool calls │ Captures: traces, outputs, docs, metadata
│ Reasoning traces │
└────────┬────────────┘
│
▼
┌─────────────────────┐
│ 8. ANALYSIS │ Root cause for every finding
│ Which policy failed│ Which tool exposed · Which doc leaked
│ Which control broke│
└────────┬────────────┘
│
▼
┌─────────────────────┐
│ 9. REPORT │ Auditor-ready, evidence-backed output
│ Risk · Violations │ Root cause + remediation included
│ Controls · Fixes │
└─────────────────────┘
🏗️ How We Built It
┌─────────────────────────────────────────────────────────────┐
│ JUPITER STACK │
├──────────────────┬──────────────────┬───────────────────────┤
│ DISCOVERY LAYER │ REASONING LAYER │ EXECUTION LAYER │
│ │ │ │
│ Conversational │ Governance │ Test Runner │
│ Discovery Agent │ Knowledge Graph │ Adversarial Engine │
│ │ │ │
│ Stakeholder │ Digital Twin │ Arize Phoenix │
│ Interview Flow │ Generator │ Observability │
└──────────────────┴──────────────────┴───────────────────────┘
│
▼
┌──────────────────┐
│ REPORT ENGINE │
│ Policy → Finding│
│ → Evidence │
│ → Remediation │
└──────────────────┘
Key design choices:
- Discovery-first before any test generation — no generic scans
- Ontology-based mapping for cross-customer governance reasoning
- Every test traceable:
Policy → Control → Risk → Prompt - Integrated observability for why, not just what
🚧 Challenges We Ran Into
CHALLENGE HOW WE NAVIGATED IT
────────────────────────────── ──────────────────────────────────────
Modeling diverse AI architectures Built flexible ontology with generic
across customers entity types that map to any system
Converting policy language into Structured Policy → Control → Asset
machine-executable tests → Risk pipeline with LLM parsing
Avoiding false positives in Governance-aware context filtering
governance violation detection tied to digital twin, not raw output
Keeping audit tests relevant as Continuous discovery refresh tied
AI systems evolve rapidly to system change signals
🏆 Accomplishments We're Proud Of
┌──────────────────────────────────────────────────────────┐
│ ✅ Built end-to-end: Discovery → Twin → Test → Report │
│ ✅ Policies auto-converted into executable test cases │
│ ✅ Every finding linked to Policy → Control → Evidence │
│ ✅ Observability integration capturing full traces │
│ ✅ Governance reports ready for real compliance reviews │
└──────────────────────────────────────────────────────────┘
Transformed governance from a static checklist into a living, continuous audit layer.
📚 What We Learned
ASSUMPTION REALITY
──────────────────────────── ──────────────────────────────────
Governance = security scanning → Governance = reasoning over policy,
risk, assets, and controls together
Generic red team prompts work → Tests must be traceable to policy
or findings are meaningless
Observability = monitoring → Observability must answer *why*,
not just log *what* happened
One-time audit is sufficient → AI systems evolve daily; governance
must be continuous, not periodic
🚀 What's Next for Jupiter AI
┌────────────────────────────────────────────────────────────────┐
│ ROADMAP │
├────────────────────┬───────────────────────────────────────────┤
│ NEAR TERM │ Expand regulation coverage │
│ │ GDPR · HIPAA · SOC2 · EU AI Act │
├────────────────────┼───────────────────────────────────────────┤
│ MID TERM │ Real-time guardrail enforcement │
│ │ Block violations before they complete │
├────────────────────┼───────────────────────────────────────────┤
│ LONG TERM │ Governance OS for enterprise AI │
│ │ Continuous assurance across all agents │
└────────────────────┴───────────────────────────────────────────┘
Not replacing governance teams — giving them an autonomous auditor that continuously understands, tests, and validates AI at scale.
NOTE: Currently It is fune tuned for fintech agents only.
Jupiter AI — From AI Observability to Autonomous AI Governance
Log in or sign up for Devpost to join the conversation.