Inspiration
The endless debug-test-fix cycle inspired VibeCI. Developers spend countless hours writing code, running tests, analyzing failures, applying fixes, and repeating, often for a single feature. We asked: what if an AI could handle this entire loop autonomously? We envisioned a world where developers describe what they want and an intelligent agent delivers verified, working code, proving that it works before a human ever sees it.
What it does
VibeCI is an autonomous code engineer powered by Google Gemini that takes a task description and independently:
- Analyzes the codebase and requirements
- Plans a minimal implementation approach
- Generates code patches (unified diffs)
- Runs tests in isolated containers
- Diagnoses any failures using test logs
- Iterates with fixes until all tests pass
- Produces verification artifacts (logs, diffs, screenshots)
All of this happens without human intervention until the task is complete.
How we built it
We built VibeCI with a modern full-stack architecture:
- AI Engine: Google Gemini 3 Pro with structured JSON outputs for planning, patch generation, failure analysis, and fix generation
- Backend: Node.js + TypeScript + Express orchestrating the autonomous loop
- Frontend: React + Vite with a real-time trace viewer and glassmorphism UI
- Database: SQLite for task and artifact persistence
- Testing: Jest for unit tests, Playwright for E2E verification
- Real-time Comms: WebSocket for live streaming of agent thoughts and actions
The core innovation is our self-correcting orchestration loop: the agent plans, generates code, and runs tests. If the tests fail, it analyzes the logs and generates fixes automatically.
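The loop described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not VibeCI's actual code: the `AgentSteps` interface, the `runAgentLoop` function, and the default cap of three iterations are all assumptions made for the example. The real steps would be asynchronous (model calls, container runs); the sketch uses synchronous callbacks to stay short.

```typescript
// Hypothetical types: none of these names come from the VibeCI codebase.
interface TestResult {
  passed: boolean;
  log: string;
}

interface AgentSteps {
  plan(task: string): string;                            // returns a plan description
  generatePatch(plan: string): string;                   // returns a unified diff
  runTests(patch: string): TestResult;                   // applies the patch, runs the suite
  diagnose(log: string): string;                         // summarizes why the tests failed
  generateFix(patch: string, diagnosis: string): string; // returns a revised diff
}

interface LoopOutcome {
  success: boolean;
  iterations: number;
  finalPatch: string;
  logs: string[];
}

// Plan once, then test and fix until the tests pass or the cap is reached.
function runAgentLoop(task: string, steps: AgentSteps, maxIterations = 3): LoopOutcome {
  const logs: string[] = [];
  let patch = steps.generatePatch(steps.plan(task));

  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const result = steps.runTests(patch);
    logs.push(result.log);
    if (result.passed) {
      return { success: true, iterations: iteration, finalPatch: patch, logs };
    }
    // Only ask for a fix if there is another iteration left to test it.
    if (iteration < maxIterations) {
      patch = steps.generateFix(patch, steps.diagnose(result.log));
    }
  }
  return { success: false, iterations: maxIterations, finalPatch: patch, logs };
}

// Stub steps that fail on the first patch and pass on the fixed one.
const stubSteps: AgentSteps = {
  plan: (task) => `plan for: ${task}`,
  generatePatch: () => "patch-v1",
  runTests: (patch) =>
    patch === "patch-v2"
      ? { passed: true, log: "all tests passed" }
      : { passed: false, log: "1 test failed" },
  diagnose: (log) => `diagnosis of: ${log}`,
  generateFix: () => "patch-v2",
};

const outcome = runAgentLoop("add a health check endpoint", stubSteps);
```

Injecting the steps as an interface keeps the loop itself free of model or container details, so it can be exercised with stubs like the ones shown.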
Challenges we ran into
- Reliable diff parsing: Getting Gemini to generate valid unified diffs that apply cleanly to real codebases required extensive prompt engineering and structured output schemas
- Orchestration complexity: Managing the state machine of plan → patch → test → diagnose → fix with proper error handling and rollback was intricate
- Real-time UI sync: Streaming agent thoughts and events via WebSocket while keeping the UI responsive required careful architecture
- Production deployment: Configuring Heroku with proper git binary paths and environment variables for a monorepo presented unexpected hurdles
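One common way a model-generated unified diff fails to apply is a hunk header whose line counts disagree with the hunk body. The sketch below shows one possible pre-check for that case. It is an illustration only: `validateHunks` is a hypothetical helper, not part of VibeCI, and it checks line counts alone, not whether the context lines match the target file.

```typescript
// Matches a unified-diff hunk header such as "@@ -12,7 +12,9 @@".
// In the unified format, an omitted count means 1.
const HUNK_HEADER = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/;

// Returns a list of problems; an empty list means every hunk body
// has exactly the number of old and new lines its header declares.
function validateHunks(diff: string): string[] {
  const errors: string[] = [];
  const lines = diff.split("\n");
  if (lines[lines.length - 1] === "") lines.pop(); // ignore the final newline

  let header = "";
  let oldLeft = 0; // old-file lines still expected in the current hunk
  let newLeft = 0; // new-file lines still expected in the current hunk
  let inHunk = false;

  for (const line of lines) {
    const match = HUNK_HEADER.exec(line);
    if (match) {
      if (inHunk) errors.push(`${header}: hunk ends before its declared line counts are met`);
      header = line;
      const oldCount = match[2];
      const newCount = match[4];
      oldLeft = oldCount === undefined ? 1 : Number(oldCount);
      newLeft = newCount === undefined ? 1 : Number(newCount);
      inHunk = oldLeft > 0 || newLeft > 0;
      continue;
    }
    if (!inHunk) continue;               // file headers and other text between hunks
    if (line.startsWith("\\")) continue; // "\ No newline at end of file"

    if (line.startsWith(" ")) {
      oldLeft--; // context lines count toward both sides
      newLeft--;
    } else if (line.startsWith("-")) {
      oldLeft--;
    } else if (line.startsWith("+")) {
      newLeft--;
    } else {
      errors.push(`${header}: unexpected line in hunk body`);
      inHunk = false;
      continue;
    }

    if (oldLeft < 0 || newLeft < 0) {
      errors.push(`${header}: more lines than the header declares`);
      inHunk = false;
    } else if (oldLeft === 0 && newLeft === 0) {
      inHunk = false; // hunk complete
    }
  }
  if (inHunk) errors.push(`${header}: hunk ends before its declared line counts are met`);
  return errors;
}

const body = [
  " export function greet(name: string) {",
  '-  return "hi " + name;',
  "+  const trimmed = name.trim();",
  '+  return "hi " + trimmed;',
  " }",
];
const fileHeader = ["--- a/greet.ts", "+++ b/greet.ts"];

const goodDiff = [...fileHeader, "@@ -1,3 +1,4 @@", ...body].join("\n");
const badDiff = [...fileHeader, "@@ -1,3 +1,3 @@", ...body].join("\n");

const goodErrors = validateHunks(goodDiff); // header counts match the body
const badErrors = validateHunks(badDiff);   // header declares too few new lines
```

A check like this can run before attempting to apply a patch, so a malformed diff can be sent back to the model with a specific complaint instead of a generic apply failure.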
Accomplishments that we're proud of
- 90% time savings: Tasks that took 30 minutes manually now complete in ~3 minutes
- 75% first-try success rate: Most tasks complete in ≤3 iterations
- Thought Signatures: Structured reasoning checkpoints for full auditability
- Premium UI: Glassmorphism design with real-time trace viewer showing the agent "thinking"
- End-to-end autonomous flow: From task description to verified, working code with zero human intervention
What we learned
- Structured outputs are crucial: JSON schemas make LLM outputs reliable and parseable
- Self-correction beats single-shot: The iterative fix loop dramatically improves success rates
- Transparency builds trust: Showing the agent's reasoning in real time helps users understand and trust the system
- Prompt engineering is an art: Small changes to system prompts have outsized impacts on output quality
- Agentic AI needs guardrails: Rate limiting, sandboxing, and verification artifacts are essential for safe autonomous operation
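To illustrate the point about structured outputs, here is a minimal sketch of validating a model response before the rest of the pipeline uses it. The `Plan` shape and its field names (`summary`, `filesToChange`) are assumptions invented for this example; they are not VibeCI's actual schema.

```typescript
// Hypothetical plan shape; the real schema is not shown in this write-up.
interface Plan {
  summary: string;
  filesToChange: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Type guard: checks the parsed JSON at runtime instead of trusting a cast.
function isPlan(value: unknown): value is Plan {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate["summary"] === "string" && isStringArray(candidate["filesToChange"]);
}

// Returns a typed plan, or null if the text is not JSON or has the wrong shape.
function parsePlan(raw: string): Plan | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isPlan(parsed) ? parsed : null;
  } catch {
    return null; // not valid JSON at all
  }
}

const validPlan = parsePlan('{"summary":"Add /health route","filesToChange":["src/app.ts"]}');
const missingField = parsePlan('{"summary":"Add /health route"}');
const notJson = parsePlan("Sure! Here is the plan:");
```

Returning `null` for anything malformed gives the orchestrator a single, explicit signal to retry or stop, which fits the guardrail idea in the last bullet.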
What's next for VibeCI
- GitHub PR Integration: Auto-create pull requests with verification artifacts attached
- Multi-language Support: Extend beyond JavaScript/TypeScript to Python, Go, and more
- Team Collaboration: Shared dashboards and task queues for development teams
- Custom Prompt Templates: Let teams define their own coding standards and patterns
- Enterprise Features: SSO, audit logs, and on-prem deployment options
- Jira/Slack Integrations: Trigger tasks from issue trackers and get notifications in team chat
Built With
- archiver
- cors
- css
- docker
- dotenv
- express.js
- gemini
- git
- github
- heroku
- html
- javascript
- jest
- playwright
- react
- sqlite
- typescript
- uuid
- vite
- websockets