Project Story
Inspiration
I just wanted my own code reviewer. That's it.
Tools like Claude Code and Codex are amazing for writing code, but using them for review? That felt like burning tokens I could spend on actually building things. And let's be real, most of what I'm shipping these days is AI-generated anyway, so having AI review AI-written code with the same expensive tool felt... off.
But here's the thing - code review matters now more than it ever did. We're generating code faster than we can understand it. Even without AI, humans write bugs, we always have. But now we're writing bugs at 10x speed. Solo developers almost never get their code reviewed. Privacy, fear of judgment, or just nobody around to do it. I've been there.
So I wanted something simple. Run one command, get a review. Only the diff goes to the AI provider, the rest stays on your machine. Use free-tier providers so there's zero excuse not to run it before pushing.
That's how Diffgazer started - as a "what if I just built this for myself" kind of thing, during the Gemini API Developer Competition.
What it does
Diffgazer is a local AI code review tool. You run diffgazer in your terminal, it starts a server, opens a browser, and reviews your git diff.
It has five review lenses, each is a separate AI agent with its own prompt:
- Correctness - logic errors, edge cases, null handling, race conditions
- Security - OWASP Top 10, injection, XSS, auth bypass
- Performance - N+1 queries, memory leaks, algorithmic complexity
- Simplicity - over-engineering, dead code, naming issues
- Tests - missing tests, brittle tests, flaky patterns
Issues are ranked: Blocker, High, Medium, Low, Nit. You can click any issue to get a drilldown - root cause, impact, suggested fix, patch. The review streams in real time so you can watch agents work. Git blame shows who last touched the affected code (spoiler: it's usually you).
Three providers: Gemini (default, free tier), Z.AI (free tier), OpenRouter (access to Claude, GPT, and others). Review history with search and filters so you can look back at past reviews.
How I built it
TypeScript pnpm monorepo. The structure looks like this:
apps/cli/ → Ink 6 launcher, bundles everything into one installable package
apps/server/ → Hono backend, AI providers via Vercel AI SDK
apps/web/ → React 19 + TanStack Router + Vite 7 + Tailwind 4
packages/ → Shared core, API client, Zod 4 schemas, UI components
The review pipeline has five steps: parse the diff, build project context, run the lenses (parallel or sequential), enrich issues with git blame and surrounding code, then deduplicate and sort.
Some decisions that shaped the project:
Result<T, E>for error handling instead of try/catch everywhere - forces you to handle both paths- React 19 Compiler for auto-memoization, no manual
useCallback/useMemo - ESM only, no CommonJS
- Security even for localhost - CORS locked to
127.0.0.1, host header validation, XML-escaping all user content in AI prompts, OS keyring for API keys - Keyboard-first navigation across the whole UI
The CLI is the distribution unit. npm install -g diffgazer and you get the server, the web app, everything. One embedded Hono server on port 3000.
I'll be honest - most of the code was generated by Claude and Codex during the hackathon. I used Diffgazer itself to review the code as I was building it, and that's where it got ironic. The tool kept finding issues in its own codebase. It proved the point better than any pitch could.
Challenges I ran into
Parallel agents are hard across providers. Running 5 AI agents at the same time sounds simple until providers start dropping connections. Z.AI couldn't handle 5 concurrent requests reliably (probably skill issue within my implementation lol). Gemini Flash handled it fine, which is one of the reasons it became the default. I added sequential mode as a fallback for providers that can't keep up.
Prompt injection in diffs. This one was fun to learn about. Someone could rename a variable or add a comment that contains instructions for the AI, and instead of reviewing the code, the AI follows those instructions. This already happened to GitHub Copilot (CVE-2025-53773). I XML-escape everything before it hits the AI and added hardening instructions to the system prompts.
"Localhost is safe" is a lie. DNS rebinding (CVE-2024-28224) lets malicious websites talk to your localhost services. This already happened to Ollama and Jupyter. A random website you visit could trigger reviews, read your diffs, or steal your API keys. So yeah, CORS and host validation even though there's no login screen. Local doesn't mean safe.
SSE session resilience. If you close your browser mid-review, the review shouldn't die. The server keeps going, buffers events, and replays them when you reconnect. Getting this right without leaking sessions or running out of memory took some back and forth.
The vibe coding trap. This was the biggest challenge and it wasn't technical. Building fast with AI during a hackathon confirmed my tool's own thesis - quality drops when you skip review. I came to a point where I want to drop that approach entirely because it didn't feel right. Moving fast feels productive until it stops feeling like anything at all. Ironic that a tool meant to catch this problem was built by falling into the same trap.
What I learned
Gemini Flash is underrated for code review. I really don't like Gemini for writing code, it's not there yet for me, but for reviewing diffs? Fast, cheap, accurate enough. The free tier genuinely covers 2-3 full reviews on large PRs per rate limit cycle. For running before a push, that's plenty.
Local tools need security too. I learned a lot about DNS rebinding, prompt injection, and why "it only runs on localhost" isn't a security strategy. At least what AI told me about it lol, but the CVEs are real.
AI-generated code needs more review, not less. The faster we generate code, the more important it becomes to actually read it. Diffgazer exists because of this realization.
Know your code. No need to race. This is the most important thing I took from this project. Not the tech, not the architecture, this.
Building Diffgazer during the hackathon with AI at full speed proved to me that velocity is a trap. The code was shipping fast, features were landing, it felt productive. But the quality wasn't there. I couldn't say with confidence that I reviewed each line. And that's the worst possible thing you can do in the age of AI - just go with the "vibe" and skip the review. I caught myself doing exactly what my tool was supposed to prevent.
It confirmed something I felt but didn't want to admit: the engineering behind AI-generated code is way behind in the development cycle. We're optimizing for speed and losing quality along the way. And at some point it stops feeling like engineering and starts feeling like... nothing. You're not learning, you're not improving, you're just shipping.
So I'm choosing the other path. Ship less. Preserve my programming skills. Learn new things. Avoid burnout. Enjoy the work again. These things matter more than velocity, more than feature counts, more than how fast I can push to main.
I want to be proud of what I build. And I want to actually understand it.
Know your code. No need to race.
What's next for Diffgazer
Stabilization phase until April 2026 - bug fixes, quality improvements, cleaning up hackathon debt. No breaking changes. The codebase needs to get healthier before it gets bigger.
After that:
- Local providers - Ollama, LM Studio, so your data never leaves your machine at all
- UI/UX quality pass - the interface needs love
- More providers - Anthropic and OpenAI direct, no middleman
- GitHub Actions - run Diffgazer in CI to review PRs automatically
- Headless CLI - reviews without the browser, results in terminal or JSON
The pace will be intentionally slower. I don't want this to be another vibecoded slop that ships fast and breaks faster. I want to be proud of it and deliver something useful.
No ETA on any of this. I'd rather ship something solid than hit a date.
Built With
- hono
- ink
- npm
- pnpm
- react
- tailwind
- tanstack
- typescript
- vite
- vitest
- zod
Log in or sign up for Devpost to join the conversation.