RiskBox

Inspiration

The theme of this hackathon is "Building what AI agents want." Autonomous agents are highly capable today, but they lack a fundamental piece of engineering infrastructure: a staging environment. When an agent is asked to modify authentication code, approve a contractor's PR, or issue a stablecoin payment, it usually has to act "live" in production. If it hallucinates or lacks context, the blast radius can be catastrophic.

We realized that what AI agents want is a safe place to dry-run their actions. They need a preflight checklist that catches policy violations, codebase conflicts, and payment errors before real damage occurs.

What it does

Agent Preflight Sandbox is a callable API tool designed specifically for AI agents. Before an agent takes a high-risk action, it sends a proposed task payload to our API. Our tool evaluates the blast radius across three dimensions:

Context & Policy Risk: Does the action violate internal rules?
Codebase Risk: Will this change break the broader repository?
Financial Risk: Is this payment valid and compliant?

The tool acts as a risk router: it evaluates the proposed action and returns a structured JSON verdict (SAFE, NEEDS_APPROVAL, or BLOCKED) along with a risk score and an explanation. This lets an agent pause and escalate a task to a human when things look dangerous.

How we built it

We built the orchestration engine on a Python (FastAPI) backend, treating the sandbox as an API-first tool, and added a minimal Next.js dashboard where human operators can audit the agent's run history. To evaluate the risks, we integrated three hackathon sponsors, along with LiteLLM as an AI gateway:

Nia by Nozomio: We use Nia's indexing and search API to retrieve live documentation, internal policies, and framework constraints so the sandbox is grounded in real-world rules.
Greptile: If the agent proposes a code change, we route the diff and repository URL to Greptile's API to understand the downstream, repo-wide impact rather than looking only at isolated files.
AllScale: If the agent proposes an action involving money (like paying a contractor for merged work), we use AllScale to simulate the invoice and payment flow, validating settlement and compliance risk.
LiteLLM + CLōD: We use LiteLLM as our AI Gateway to route the raw findings from Nia, Greptile, and AllScale into an LLM (powered by Claude/CLōD credits), which synthesizes them into the final risk_score and verdict.

Challenges we ran into

Building an orchestration layer that waits on three distinct APIs, each analyzing a completely different type of data (text, code graphs, and financial ledgers), was challenging. We had to design strict Pydantic schemas in FastAPI to normalize the outputs from Nia, Greptile, and AllScale so that the final LLM synthesis step was not confused by conflicting API shapes.

Accomplishments that we're proud of

We shifted our mindset from "building an app for humans" to "building infrastructure for agents." Instead of building a flashy UI that a human clicks through, we built a modular API endpoint that an existing agent (like Cursor) could call in the background. We are especially proud of how well the three sponsor tools compose into a single, holistic safety check.

What we learned

We learned that multi-agent architecture isn't just about agents talking to each other; it's about giving agents the right tools to evaluate their own confidence. We also learned how to use an AI Gateway (LiteLLM) to separate our app logic from the underlying model providers, which made the backend much cleaner.

What's next for Agent Preflight

In the future, we want to expand the sandbox from a "dry-run evaluator" into a full virtual execution environment using isolated containers.
We also plan to integrate with agent marketplaces (like Clustly), so that if our tool returns BLOCKED due to low confidence, the task is automatically delegated and posted as a bounty for a specialist agent to solve.
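
To make the verdict shape from "What it does" concrete, here is a minimal sketch of how normalized findings could be turned into a SAFE / NEEDS_APPROVAL / BLOCKED response. It is an illustration under stated assumptions, not the project's code: the `Finding` shape, the 0.0 to 1.0 risk scale, the two thresholds, and the `route` function are all hypothetical. It uses stdlib dataclasses instead of the Pydantic schemas mentioned above to stay dependency-free, and it replaces the LLM synthesis step with a simple "highest risk wins" rule.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    SAFE = "SAFE"
    NEEDS_APPROVAL = "NEEDS_APPROVAL"
    BLOCKED = "BLOCKED"


@dataclass(frozen=True)
class Finding:
    """One normalized checker result (hypothetical shape)."""
    source: str   # e.g. "policy", "codebase", "financial"
    risk: float   # assumed scale: 0.0 (no risk) to 1.0 (maximum risk)
    note: str     # short human-readable reason

    def __post_init__(self):
        if not 0.0 <= self.risk <= 1.0:
            raise ValueError(f"risk must be within [0, 1], got {self.risk}")


# Hypothetical cut-offs; a real deployment would tune these.
APPROVAL_THRESHOLD = 0.4
BLOCK_THRESHOLD = 0.75


def route(findings: list) -> dict:
    """Map findings to a JSON-serializable verdict (stand-in for LLM synthesis)."""
    if not findings:
        # With no evidence at all, escalate instead of assuming safety.
        return {
            "verdict": Verdict.NEEDS_APPROVAL.value,
            "risk_score": None,
            "explanation": "No findings were returned by any checker.",
        }

    # The riskiest single finding drives the overall score.
    worst = max(findings, key=lambda f: f.risk)
    if worst.risk >= BLOCK_THRESHOLD:
        verdict = Verdict.BLOCKED
    elif worst.risk >= APPROVAL_THRESHOLD:
        verdict = Verdict.NEEDS_APPROVAL
    else:
        verdict = Verdict.SAFE

    return {
        "verdict": verdict.value,
        "risk_score": worst.risk,
        "explanation": f"{worst.source}: {worst.note}",
    }


if __name__ == "__main__":
    example = [
        Finding("policy", 0.1, "no rule violated"),
        Finding("financial", 0.8, "invoice exceeds approved limit"),
    ]
    print(json.dumps(route(example), indent=2))
```

Taking the maximum rather than an average is a deliberately conservative choice for this sketch: one high-risk dimension is enough to block, even if the other two look clean.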

Updates

Private user started this project — May 10, 2026 09:07 PM EDT
