Agent Brief

Inspiration

AI agents usually fail for boring reasons: vague instructions, messy repos, and unclear rules about money, email, and irreversible actions. We kept seeing “run this in Cursor” moments where a few missing constraints could turn into a bad booking, a bad refactor, or a leaked assumption pulled from stale docs. There was no quick pre-flight step that treated the task and the workspace as one safety surface. Agent Brief is that checklist: turn a messy human request plus real project context into something an agent can execute without guessing.

What it does

Agent Brief is a local web app (Next.js on localhost) that runs beside your repo. It:

Scans the workspace (docs and configs, with sensible skips like node_modules, .git, and .env) to understand what an agent might read and trust.
Analyzes your task for ambiguity, missing constraints, and risky permissions.
Produces a structured pre-flight report: readiness scores, expandable “context nutrition” rows (why, evidence, fixes), Safety Issues (with an “Agent OSHA” flavor so risks stick in memory), an approval queue, a human-readable work order, a receipt template, and a Copy for Cursor handoff that packages the execution contract for your agent.
The pitch: messy request + messy workspace → safe, explicit work order.

How we built it

We used the Next.js App Router for UI and API routes in one package, with a workspace scanner module that walks the tree, caps depth and size, and concatenates file contents with clear headers for the model. The /api/analyze route sends a single structured prompt to CLōD (OpenAI-compatible API) using DeepSeek V3, with stream: true so the UI can render sections as JSON arrives. The client parses the stream and progressively fills the score cards, nutrition rows, safety issues, approvals, work order, and receipt. Pre-flight resolution (resolving safety items and answering approvals) updates the work order client-side, so the final brief matches the user's choices without a second LLM call for the MVP. Styling follows a dark, product-style shell (Inter + JetBrains Mono, a resizable two-panel layout, and presets for quick demos).
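As a rough illustration of the scanner's "skip, cap, and concatenate with headers" step, here is a minimal sketch. It is not the project's actual module: the names (ScannedFile, shouldSkip, buildContext), the header format, and every cap value are assumptions made for the example, and it works on an in-memory file list rather than walking the disk.

```typescript
// Hypothetical shape for a file the scanner has already read from disk.
type ScannedFile = { path: string; content: string };

// Skips named in the write-up; the real list may be longer.
const SKIP_NAMES = new Set(["node_modules", ".git"]);

// Assumed caps, chosen only for this sketch.
const MAX_DEPTH = 4;
const MAX_FILE_CHARS = 4000;
const MAX_TOTAL_CHARS = 20000;

// True when a relative path is too deep or touches a skipped name.
function shouldSkip(relPath: string): boolean {
  const parts = relPath.split("/");
  if (parts.length - 1 > MAX_DEPTH) return true; // directories above the file
  return parts.some(
    (p) => SKIP_NAMES.has(p) || p === ".env" || p.startsWith(".env.")
  );
}

// Concatenates allowed files under clear headers, truncating each file
// and stopping once the total budget would be exceeded.
function buildContext(files: ScannedFile[]): string {
  const sections: string[] = [];
  let total = 0;
  for (const file of files) {
    if (shouldSkip(file.path)) continue;
    const body = file.content.slice(0, MAX_FILE_CHARS);
    const section = `=== FILE: ${file.path} ===\n${body}\n`;
    if (total + section.length > MAX_TOTAL_CHARS) break;
    sections.push(section);
    total += section.length;
  }
  return sections.join("\n");
}
```

Keeping the filter and the assembly as pure functions means the same logic can be exercised with a mock file list, independent of the real filesystem walk.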

Challenges we ran into

Structured output over streaming: getting consistently parseable JSON while streaming required careful handling of partial chunks and fallbacks when the model drifts.
Context limits vs. useful workspace signal: balancing how much of the repo to include without blowing tokens or leaking secrets (hence skips, caps, and optional extra context).
Making “safety” actionable: scores alone are not enough; we needed expandable evidence, fix text, and patches that actually change the work order.
Hackathon time: we prioritized the end-to-end demo path (scan → analyze → resolve → copy) over persistence, auth, and automated tests.
Solo build: the team is a single competitor, so he ended up hitting Cursor's usage limit.
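The first challenge above, parsing JSON that is still arriving, can be sketched as follows. This is one common approach and not necessarily what Agent Brief does: close any open string and brackets, try JSON.parse, and fall back to the last good value when the repaired text is still invalid. closePartialJson and StreamingJsonBuffer are hypothetical names, and the field names in the usage note are made up.

```typescript
// Appends the closers needed to make a truncated JSON prefix balanced.
// It does not fix dangling keys or trailing commas; those fail to parse
// and are handled by the fallback in StreamingJsonBuffer.
function closePartialJson(text: string): string {
  const stack: string[] = [];
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") stack.push("}");
    else if (ch === "[") stack.push("]");
    else if (ch === "}" || ch === "]") stack.pop();
  }
  let out = text;
  if (inString) {
    // Drop a dangling backslash so the added quote is not escaped.
    if (escaped) out = out.slice(0, -1);
    out += '"';
  }
  return out + stack.reverse().join("");
}

// Accumulates chunks and returns the best parse seen so far.
class StreamingJsonBuffer {
  private text = "";
  private lastGood: unknown = null;

  push(chunk: string): unknown {
    this.text += chunk;
    try {
      this.lastGood = JSON.parse(closePartialJson(this.text));
    } catch {
      // Model drifted or the chunk ended mid-token: keep the last good value.
    }
    return this.lastGood;
  }
}
```

Pushing `{"scores":{"clarity":72},"safetyIssues":[{"title":"Sends em` yields an object whose first issue title is the truncated "Sends em"; a following chunk that ends on a dangling key leaves that value unchanged until the stream completes. One caveat of this approach is that a number cut mid-digit parses as a smaller number, so partially rendered values should be treated as provisional.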

Accomplishments that we're proud of

A workspace-aware flow that is not just “rewrite my prompt” but audits environment + task together.
The work order as the product: a readable execution contract (goal, allowed/blocked actions, approvals, missing info, success criteria, receipt) instead of a wall of JSON in the main UI.
Streaming UI that feels alive in a demo and matches how people expect modern AI tools to behave.
Copy for Cursor as a practical handoff: zero integration magic, but immediate usefulness for real workflows.

What we learned

Most agent failures are contract failures: unstated permissions, unstated “done,” and unstated sources of truth.
Filesystem context changes the quality of risk detection dramatically compared to prompt-only tools.
For a short build window, one strong LLM pass + deterministic client updates beats two fragile round-trips.
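To make "deterministic client updates" concrete, here is a minimal sketch of how an approval decision could patch the work order without another model call. The WorkOrder and Approval shapes and the applyDecision helper are assumptions for illustration; the project's real data model may differ.

```typescript
// Hypothetical shapes; field names are illustrative only.
type Decision = "pending" | "approved" | "denied";
type Approval = { action: string; decision: Decision };
type WorkOrder = {
  goal: string;
  allowedActions: string[];
  blockedActions: string[];
  pendingApprovals: string[];
};

// Returns a new work order with the action moved to the list that
// matches the decision. The input order is never mutated.
function applyDecision(order: WorkOrder, approval: Approval): WorkOrder {
  const without = (list: string[]) =>
    list.filter((a) => a !== approval.action);
  const base: WorkOrder = {
    ...order,
    allowedActions: without(order.allowedActions),
    blockedActions: without(order.blockedActions),
    pendingApprovals: without(order.pendingApprovals),
  };
  if (approval.decision === "approved") {
    return { ...base, allowedActions: [...base.allowedActions, approval.action] };
  }
  if (approval.decision === "denied") {
    return { ...base, blockedActions: [...base.blockedActions, approval.action] };
  }
  return { ...base, pendingApprovals: [...base.pendingApprovals, approval.action] };
}
```

Because the function is pure, the same input always produces the same brief, which is the property that makes a single LLM pass sufficient for this step.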

What's next for Agent Brief

Stronger validation of streamed JSON and richer error recovery.
Tests for the workspace scanner (mock FS) and snapshot tests for prompt assembly.
Optional file watch or re-scan on demand, history of briefs, and tighter Cursor-oriented formats.
Exploration of multiple providers and tighter guardrails for enterprise-style policies—still with local-first, privacy-conscious defaults.

Built With

clod
nextjs

Updates

Victor Tran started this project — May 10, 2026 08:33 PM EDT
