The Sentinel

Inspiration

Every product team has the same frustrating loop: a PM spots a visual bug, screenshots it, and posts it in Slack; a developer context-switches to investigate, files a ticket, and eventually pushes a fix days later. We asked ourselves: what if a non-technical person could just show the bug and get a fix automatically? The Sentinel was born from the idea that seeing a bug should be enough to fix it. With Gemini 3's multimodal intelligence now capable of understanding images alongside code, we realized the technology finally exists to bridge the gap between what users see and what developers write.

What it does

The Sentinel is an autonomous visual bug detection and repair agent. Users upload a screenshot or screen recording of a UI issue (or paste a deployed URL for automatic capture), point it at a GitHub repository, and describe what looks wrong in plain English. The Sentinel then:

Scouts the codebase to find relevant files using Gemini 3's reasoning
Analyzes the visual evidence against the source code to pinpoint the root cause
Generates a production-ready code fix
Creates a Pull Request on GitHub — ready for one-click merge
Self-corrects if CI/CD checks fail, iterating autonomously until the fix passes

No coding knowledge required. Product managers, designers, and QA testers can ship UI fixes as fast as they can spot them.
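
The propose, check, and retry cycle behind the self-correction step can be sketched roughly as follows. This is a minimal illustration, not The Sentinel's actual code: `repair_loop`, `FixAttempt`, and the two callables are hypothetical names, and the `max_attempts` cap is an assumption added so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class FixAttempt:
    """One iteration of the loop: the proposed patch and what CI said about it."""
    patch: str
    ci_passed: bool
    ci_log: str


def repair_loop(
    propose_fix: Callable[[Optional[str]], str],
    run_ci: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,  # assumed cap; the writeup does not state a limit
) -> List[FixAttempt]:
    """Propose a patch, run CI on it, and feed any failure log into the next proposal."""
    attempts: List[FixAttempt] = []
    feedback: Optional[str] = None  # there is no CI log before the first attempt
    for _ in range(max_attempts):
        patch = propose_fix(feedback)
        passed, log = run_ci(patch)
        attempts.append(FixAttempt(patch, passed, log))
        if passed:
            break
        feedback = log  # the next proposal sees the failing CI output
    return attempts
```

In a real pipeline, `propose_fix` would wrap the model call and `run_ci` would push a commit and poll the repository's checks; both are left as plain callables here so the control flow can be tested without network access.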

How we built it

Backend: FastAPI (Python) with a modular service architecture, split into separate services for GitHub integration (PyGithub), visual capture (Playwright), and AI analysis (Gemini 3 Flash via google-generativeai)
AI Core: Three Gemini 3 prompting strategies — Scout Mode for intelligent file selection, Analysis Mode for multimodal visual-to-code bug detection with structured JSON output, and Self-Correction Mode for autonomous error resolution
Auth: Dual authentication — email/password with bcrypt + JWT, and GitHub OAuth for seamless developer onboarding
Database: PostgreSQL on Neon with SQLAlchemy async ORM
Frontend: Next.js 14 with a dark-themed glassmorphic UI, real-time analysis progress tracking, drag-and-drop media uploads, and an animated terminal-style system status panel
Deployment: Render (backend) + Vercel (frontend), with Neon for managed PostgreSQL

Challenges we ran into

Playwright on Windows: asyncio's selector event loop doesn't support subprocess creation on Windows, causing Playwright to crash with NotImplementedError. We solved it by setting WindowsProactorEventLoopPolicy, the loop type that does support subprocesses, and making the capture service fail gracefully
Structured AI Output — getting Gemini to consistently return valid JSON with exact file paths and complete code blocks required extensive prompt engineering and a robust fallback parser that handles markdown code fences and partial JSON
Token Priority Chain — supporting three sources of GitHub tokens (form input, OAuth, environment) without conflicts required careful precedence logic
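
Two of the challenges above, the fallback parser and the token precedence logic, lend themselves to a short sketch. The code below is an illustration under assumptions, not the project's implementation: `parse_model_json` and `resolve_github_token` are hypothetical names, the precedence order (form input, then OAuth, then environment) is assumed from the order in which the sources are listed, and truncated JSON is not repaired here but simply yields `None`.

```python
import json
import re
from typing import Any, Optional

# Build the fence marker programmatically so this snippet can itself sit inside a fenced block.
FENCE = "`" * 3
_FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_model_json(text: str) -> Optional[Any]:
    """Best-effort extraction of a JSON value from a model reply."""
    # 1. Try the raw reply, 2. any fenced blocks, 3. the outermost {...} span.
    candidates = [text.strip()]
    candidates += [block.strip() for block in _FENCED_BLOCK.findall(text)]
    start, end = text.find("{"), text.rfind("}")
    if 0 <= start < end:
        candidates.append(text[start:end + 1])
    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    return None  # nothing parseable; the caller decides how to degrade


def resolve_github_token(
    form_token: Optional[str],
    oauth_token: Optional[str],
    env_token: Optional[str],
) -> Optional[str]:
    """Return the first non-blank token, checked in the assumed precedence order."""
    for token in (form_token, oauth_token, env_token):
        if token and token.strip():
            return token.strip()
    return None
```

Treating blank strings the same as missing values keeps an empty form field from shadowing a valid OAuth or environment token, which is one plausible source of the conflicts mentioned above.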

Accomplishments that we're proud of

End-to-end autonomy: From screenshot to merged PR with zero human code intervention — the full "Vibe Loop" works
Non-technical accessibility: A product manager can genuinely fix a UI bug without opening an IDE or knowing what a "component" is
Self-healing architecture: The self-correction loop that reads CI errors and iterates on its own fix feels genuinely magical
Production-grade auth: Full JWT auth system with protected routes

What we learned

Gemini 3's multimodal capabilities are remarkably good at mapping visual UI issues to code: it can reason about layout, spacing, colors, and component structure from a single screenshot
Building for non-technical users requires a fundamentally different UX mindset: every error message, loading state, and progress indicator matters more than any feature
The gap between "AI generates code" and "AI ships code" is enormous: GitHub API integration, branch management, commit signing, and PR creation add layers of complexity that the AI part doesn't solve
Graceful degradation is essential — every external service (Playwright, GitHub, Gemini) can fail, and the app needs to keep working

What's next for The Sentinel

Figma-to-Code Diffing: Upload a Figma design and a deployed URL — The Sentinel identifies every pixel-level deviation and fixes them all in one PR
Real-time Monitoring: A browser extension that continuously watches deployed sites for visual regressions and auto-creates fix PRs before users even report bugs
Team Dashboard: Analytics showing bug trends, fix success rates, and time-to-resolution across projects
Multi-file Fixes: Expanding beyond single-file patches to coordinated changes across components, stylesheets, and configuration files
Design System Enforcement: Automatically detecting when new code violates an organization's design tokens and proposing corrections

Built With

fastapi
githubapi
nextjs
postgresql
python

Updates

Umer Zubair started this project — Feb 09, 2026 07:54 PM EST
