Inspiration
Every product team has the same frustrating loop: a PM spots a visual bug, screenshots it, posts it in Slack, a developer context-switches to investigate, files a ticket, eventually pushes a fix days later. We asked ourselves — what if a non-technical person could just show the bug and get a fix automatically? The Sentinel was born from the idea that seeing a bug should be enough to fix it. With Gemini 3's multimodal intelligence now capable of truly understanding images alongside code, we realized the technology finally exists to bridge the gap between what users see and what developers write.
What it does
The Sentinel is an autonomous visual bug detection and repair agent. Users upload a screenshot or screen recording of a UI issue (or paste a deployed URL for automatic capture), point it at a GitHub repository, and describe what looks wrong in plain English. The Sentinel then:
- Scouts the codebase to find relevant files using Gemini 3's reasoning
- Analyzes the visual evidence against the source code to pinpoint the root cause
- Generates a production-ready code fix
- Creates a Pull Request on GitHub — ready for one-click merge
- Self-corrects if CI/CD checks fail, iterating autonomously until the fix passes
No coding knowledge required. Product managers, designers, and QA testers can ship UI fixes as fast as they can spot them.
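As a rough illustration of the self-correction step in the list above, the loop can be sketched as "propose a patch, run the checks, feed the failure log back in". This is a minimal sketch under stated assumptions, not The Sentinel's actual code: `CheckResult`, `repair_until_green`, `propose_fix`, `run_checks`, and the `max_attempts` cap are all hypothetical names introduced here, and the model call and CI run are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    passed: bool
    log: str = ""  # CI output fed back to the model on failure


def repair_until_green(
    propose_fix: Callable[[Optional[str]], str],
    run_checks: Callable[[str], CheckResult],
    max_attempts: int = 3,  # assumption: a cap so the loop cannot run forever
) -> Optional[str]:
    """Return the first patch that passes checks, or None if attempts run out."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        patch = propose_fix(feedback)  # the first call has no CI feedback yet
        result = run_checks(patch)
        if result.passed:
            return patch
        feedback = result.log  # the next proposal sees the failure log
    return None
```

Keeping the model and CI behind callables means the loop can be exercised with fakes before it is wired to Gemini or GitHub.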
How we built it
- Backend: FastAPI (Python) with a modular service architecture: separate services for GitHub integration (PyGithub), visual capture (Playwright), and AI analysis (Gemini 3 Flash via google-generativeai)
- AI Core: Three Gemini 3 prompting strategies: Scout Mode for intelligent file selection, Analysis Mode for multimodal visual-to-code bug detection with structured JSON output, and Self-Correction Mode for autonomous error resolution
- Auth: Dual authentication — email/password with bcrypt + JWT, and GitHub OAuth for seamless developer onboarding
- Database: PostgreSQL on Neon with SQLAlchemy async ORM
- Frontend: Next.js 14 with a dark-themed glassmorphic UI, real-time analysis progress tracking, drag-and-drop media uploads, and an animated terminal-style system status panel
- Deployment: Render (backend) + Vercel (frontend), with Neon for managed PostgreSQL
Challenges we ran into
- Playwright on Windows: asyncio's default event loop doesn't support subprocess creation, causing Playwright to crash with NotImplementedError. We solved it by setting WindowsSelectorEventLoopPolicy and making the capture service fail gracefully
- Structured AI Output: getting Gemini to consistently return valid JSON with exact file paths and complete code blocks required extensive prompt engineering and a robust fallback parser that handles markdown code fences and partial JSON
- Token Priority Chain — supporting three sources of GitHub tokens (form input, OAuth, environment) without conflicts required careful precedence logic
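Two of the challenges above can be illustrated with short helpers. The following is a sketch under assumptions, not the project's real parser or token logic: `parse_model_json` and `resolve_github_token` are hypothetical names, the `GITHUB_TOKEN` environment variable is assumed, and the precedence order (form input, then OAuth, then environment) is taken from the order in which the sources are listed. The parser handles JSON wrapped in a markdown fence or surrounded by prose; truncated JSON simply yields None here rather than being repaired.

```python
import json
import os
import re
from typing import Any, Optional

# Matches a fenced block, with or without a "json" language tag.
FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)


def parse_model_json(text: str) -> Optional[Any]:
    """Best-effort extraction of one JSON object from a model reply."""
    candidates = [text]
    fenced = FENCE_RE.search(text)
    if fenced:
        candidates.insert(0, fenced.group(1))  # prefer the fenced payload
    for candidate in candidates:
        start = candidate.find("{")
        if start == -1:
            continue
        try:
            # raw_decode parses one JSON value and ignores trailing prose.
            value, _ = json.JSONDecoder().raw_decode(candidate[start:])
            return value
        except json.JSONDecodeError:
            continue
    return None


def resolve_github_token(
    form_token: Optional[str], oauth_token: Optional[str]
) -> Optional[str]:
    """First non-empty token wins: form input, then OAuth, then environment."""
    for token in (form_token, oauth_token, os.environ.get("GITHUB_TOKEN")):
        if token and token.strip():
            return token.strip()
    return None
```

Treating blank strings as missing matters for the form field, which typically submits an empty string rather than nothing at all.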
Accomplishments that we're proud of
- End-to-end autonomy: From screenshot to merged PR with zero human code intervention — the full "Vibe Loop" works
- Non-technical accessibility: A product manager can genuinely fix a UI bug without opening an IDE or knowing what a "component" is
- Self-healing architecture: The self-correction loop that reads CI errors and iterates on its own fix feels genuinely magical
- Production-grade auth: Full JWT auth system with protected routes
What we learned
- Gemini 3 is remarkably good at mapping visual UI issues to code: it can reason about layout, spacing, colors, and component structure from a single screenshot
- Building for non-technical users requires a fundamentally different UX mindset: every error message, loading state, and progress indicator matters more than any feature
- The gap between "AI generates code" and "AI ships code" is enormous: GitHub API integration, branch management, commit signing, and PR creation add layers of complexity that the AI part doesn't solve
- Graceful degradation is essential — every external service (Playwright, GitHub, Gemini) can fail, and the app needs to keep working
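The graceful-degradation point above can be sketched as a small wrapper that turns any failure of an external call into a logged warning and a None result, so the caller can continue without that input. This is an illustrative pattern only: `try_service`, its parameters, and the logger name are assumptions made for the example, not names from the project.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("sentinel.sketch")  # hypothetical logger name


async def try_service(
    name: str,
    call: Callable[[], Awaitable[T]],
    timeout_s: float = 30.0,
) -> Optional[T]:
    """Await an external call; on error or timeout, log it and return None."""
    try:
        return await asyncio.wait_for(call(), timeout=timeout_s)
    except Exception as exc:  # deliberately broad: any failure degrades to None
        logger.warning("%s unavailable: %r", name, exc)
        return None
```

A caller might, for example, skip automatic URL capture when the wrapper returns None and fall back to asking the user for a screenshot upload.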
What's next for The Sentinel
- Figma-to-Code Diffing: Upload a Figma design and a deployed URL — The Sentinel identifies every pixel-level deviation and fixes them all in one PR
- Real-time Monitoring: A browser extension that continuously watches deployed sites for visual regressions and auto-creates fix PRs before users even report bugs
- Team Dashboard: Analytics showing bug trends, fix success rates, and time-to-resolution across projects
- Multi-file Fixes: Expanding beyond single-file patches to coordinated changes across components, stylesheets, and configuration files
- Design System Enforcement: Automatically detecting when new code violates an organization's design tokens and proposing corrections
Built With
- fastapi
- githubapi
- nextjs
- postgresql
- python