Bench

Tech Stack Diagram
Demo
Docker Implementation

Inspiration

AI is cool. What's cooler? AI but SMARTER (& faster). Bench generates multiple implementations, verifies them with real execution, and helps developers make better engineering decisions.

I highkey didn't know my own teammates' names 'cause I spent too long going back and forth with AI. We wanted AI to stop acting like a chatbot, and we wanted to use it to its full potential without having to fix every other hallucination.

What it does

Bench generates multiple implementations, compares them side by side, verifies them with real tests, and helps developers choose the best one without leaving VS Code. It's an assistant that isn't confined to sequential processes!

Bench is a VS Code extension that uses Cerebras for fast inference, a knowledge graph + vector database for smart context compression, and FastAPI for the backend. Docker runs the benchmarks and tests.

It's a new workflow entirely. We shifted from manual coding to agentic coding back when LLMs came out. Bench takes the next step by building on Cerebras's low-latency inference, Docker-backed verification, and Backboard's optimizations.

How We Built It

We built Bench as a VS Code extension with a TypeScript webview for chat, candidate cards, logs, and preview controls. When a developer asks for a feature, the extension sends the prompt and editor context to a local FastAPI orchestrator. The orchestrator summarizes the repo, calls Cerebras to generate implementation candidates, and can use Backboard to remember prior decisions.
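The orchestrator step described above could be sketched roughly like this. It is a minimal illustration and not Bench's actual code: `GenerateRequest`, `Candidate`, `fake_generate`, and `generate_candidates` are hypothetical names, and the stub stands in for the real Cerebras call, which is not shown here. The point is only that candidates can be requested concurrently instead of one after another.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GenerateRequest:
    """Hypothetical payload the extension might send to the orchestrator."""
    prompt: str
    editor_context: str
    num_candidates: int = 3


@dataclass
class Candidate:
    """One generated implementation (hypothetical shape)."""
    candidate_id: int
    code: str


async def fake_generate(prompt: str, context: str, index: int) -> Candidate:
    """Stand-in for a model call; a real orchestrator would call Cerebras here."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return Candidate(candidate_id=index, code=f"# candidate {index} for: {prompt}")


async def generate_candidates(request: GenerateRequest) -> list[Candidate]:
    """Request all candidates concurrently instead of one after another."""
    tasks = [
        fake_generate(request.prompt, request.editor_context, i)
        for i in range(request.num_candidates)
    ]
    # gather() preserves the order of the tasks it was given
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    req = GenerateRequest(prompt="add a retry helper", editor_context="utils.py")
    for cand in asyncio.run(generate_candidates(req)):
        print(cand.candidate_id, cand.code)
```

In a real setup the stub would be replaced by an actual inference client, and the request would arrive over HTTP from the extension.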

To test the candidates, the extension sends them to a separate local FastAPI daemon that runs each one inside Docker against deterministic fixtures. Results stream back into VS Code with pass/fail status, logs, test counts, timing, and a ranked winner. From there, the developer can preview the chosen code as unsaved editor changes before applying it.
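To make the verification step a bit more concrete, here is a rough sketch of how a daemon might build a sandboxed `docker run` command and rank the results. Everything here is an assumption for illustration: `RunResult`, `docker_command`, `rank`, the `bench-runner:latest` image name, and the ranking order (fully passing first, then more tests passed, then faster) are hypothetical, since the source only says there is a ranked winner.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Hypothetical summary of one candidate's Docker test run."""
    candidate_id: int
    tests_passed: int
    tests_total: int
    seconds: float

    @property
    def passed(self) -> bool:
        # A run with zero tests does not count as passing
        return self.tests_total > 0 and self.tests_passed == self.tests_total


def docker_command(image: str, workdir: str, test_cmd: list[str]) -> list[str]:
    """Build a `docker run` argv for one candidate (the command is not executed here)."""
    return [
        "docker", "run", "--rm",          # remove the container when it exits
        "--network", "none",              # no network access during tests
        "-v", f"{workdir}:/work:ro",      # candidate code + fixtures, read-only
        "-w", "/work",                    # run the tests from the mounted folder
        image, *test_cmd,
    ]


def rank(results: list[RunResult]) -> list[RunResult]:
    """Assumed order: fully passing first, then more tests passed, then faster."""
    return sorted(results, key=lambda r: (not r.passed, -r.tests_passed, r.seconds))


if __name__ == "__main__":
    # Hypothetical image assumed to have Python and pytest installed
    cmd = docker_command(
        "bench-runner:latest",
        "/tmp/candidate-1",
        ["python", "-m", "pytest", "-q", "-p", "no:cacheprovider"],
    )
    print(" ".join(cmd))

    results = [
        RunResult(candidate_id=1, tests_passed=3, tests_total=5, seconds=0.4),
        RunResult(candidate_id=2, tests_passed=5, tests_total=5, seconds=1.2),
        RunResult(candidate_id=3, tests_passed=5, tests_total=5, seconds=0.9),
    ]
    print("winner:", rank(results)[0].candidate_id)
```

The argv list could be handed to `subprocess.run` to actually execute the tests. Disabling the network and mounting the code read-only are just one way to keep runs isolated and repeatable.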

Built With

backboard
cerebras
docker
fastapi
python
typescript

Updates

Dorothy Zheng started this project — Jun 27, 2026 03:30 PM EDT
