AI PR Code Reviewer - Backend
Inspiration
Code reviews are crucial but time-consuming. As developers, we've all experienced:
- The bottleneck: Waiting days for senior engineers to review PRs
- The overwhelm: Reviewing a 50-file PR and missing critical security issues
- The inconsistency: Different reviewers catching different things
We wanted to build an AI assistant that could provide instant, comprehensive code reviews, democratizing access to senior-level feedback for developers at all levels. The goal wasn't to replace human reviewers but to augment them by catching common issues and providing a structured starting point.
What It Does
Our backend accepts a GitHub Pull Request URL and returns a comprehensive AI-powered code review containing:
- PR Summary: What changed, why it changed, and key files
- Prioritized Findings: Security vulnerabilities, performance issues, code quality concerns
- Risk Matrix: Security, performance, breaking-change, and maintainability scoring
- Test Plan: Suggested unit tests, integration tests, and edge cases
- Merge Readiness Score: Overall assessment with blockers clearly identified
- GitHub Integration: Post reviews directly as PR comments
How We Built It
Tech Stack Decision
We chose FastAPI (Python) for the backend because:
- Async-first architecture for efficient GitHub API calls
- Automatic API documentation with Swagger UI
- Fast development cycle perfect for hackathon timelines
- Strong typing with Pydantic for reliable data validation
Architecture
1. GitHub Service Layer
- Parses PR URLs and extracts owner/repo/PR number
- Fetches PR metadata, diffs, changed files, and commits via GitHub API
- Handles OAuth for private repos
- Posts formatted reviews back to GitHub as comments
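The URL-parsing step can be sketched with a small regex helper (the function and pattern names here are illustrative, not the actual service code):

```python
import re

# Illustrative pattern; the real service code may differ.
PR_URL_RE = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/pull/(?P<number>\d+)"
)

def parse_pr_url(url: str) -> tuple:
    """Extract (owner, repo, pr_number) from a GitHub pull request URL."""
    match = PR_URL_RE.match(url.strip())
    if match is None:
        raise ValueError(f"Not a valid GitHub PR URL: {url}")
    return match["owner"], match["repo"], int(match["number"])
```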
2. Analysis Engine
- Dual-mode analysis:
- AI Mode: Uses OpenAI GPT-4o-mini with structured JSON output
- Heuristics Mode: Pattern-matching fallback for offline/no-API-key scenarios
- Scans for security patterns (hardcoded secrets, SQL injection, XSS)
- Detects performance anti-patterns (N+1 queries, nested loops)
- Identifies auth/permission changes requiring extra scrutiny
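Heuristics mode can be sketched as a pattern table applied to the added lines of a diff (a minimal sketch; the hypothetical pattern set below is far smaller than the real one):

```python
import re

# Hypothetical subset of the heuristic pattern table (the real set is larger).
SECURITY_PATTERNS = [
    ("Hardcoded secret",
     re.compile(r"(?i)(api_key|secret|password)\s*=\s*['\"][^'\"]+['\"]"), 0.6),
    ("Possible SQL injection",
     re.compile(r"(?i)execute\(.+%s"), 0.5),
]

def scan_diff(diff: str) -> list:
    """Scan only ADDED lines of a unified diff against the pattern table."""
    findings = []
    for lineno, line in enumerate(diff.splitlines(), start=1):
        if not line.startswith("+"):
            continue  # unchanged/removed lines are skipped
        for title, pattern, confidence in SECURITY_PATTERNS:
            if pattern.search(line):
                findings.append(
                    {"title": title, "line": lineno, "confidence": confidence}
                )
    return findings
```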
3. Data Layer (SQLAlchemy + SQLite)
- AnalysisRun: Tracks each PR analysis request
- AnalysisResult: Stores the structured review output
- User: Supports future GitHub OAuth integration
4. API Design (RESTful)
- POST /api/analyze: Submit a PR for analysis (returns a run_id)
- GET /api/runs/{id}: Check analysis status
- GET /api/runs/{id}/result: Retrieve the structured review
- POST /api/github/post-comment: Post the review to the PR
Quick Start
# 1. Install dependencies
cd backend
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# 2. Configure environment
cp .env.example .env
# (Optional) Edit .env to add OPENAI_API_KEY for AI analysis
# 3. Run the server
uvicorn app.main:app --reload --port 8000
# 4. Access interactive API docs
# Open: http://localhost:8000/docs
The Math Behind Risk Scoring
We calculate merge readiness using a weighted penalty system:
$$\text{Score} = \max(0, 100 - 30n_c - 15n_h - 2n_t)$$
Where:
- $n_c$ = number of critical findings
- $n_h$ = number of high-severity findings
- $n_t$ = total findings
This gives us a score between 0 and 100, with critical issues penalized far more heavily than minor ones.
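The formula translates directly into a small helper (a sketch; the function name is illustrative):

```python
def merge_readiness_score(n_critical: int, n_high: int, n_total: int) -> int:
    """Weighted penalty system: Score = max(0, 100 - 30*n_c - 15*n_h - 2*n_t)."""
    return max(0, 100 - 30 * n_critical - 15 * n_high - 2 * n_total)
```

For example, one critical and one high finding out of five total yields max(0, 100 - 30 - 15 - 10) = 45, well below merge readiness.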
What We Learned
Technical Learnings
- Async Python is powerful: Using asyncio.gather() to fetch PR data concurrently reduced API call time by ~60%
- LLM output validation is critical: We implemented strict JSON schema validation because early tests showed GPT would occasionally emit malformed JSON; our retry logic with "fix invalid JSON" prompts improved reliability to 99%+
- GitHub API quirks: httpx doesn't follow redirects (301s) by default; adding follow_redirects=True solved mysterious failures
- Background tasks matter: For the hackathon MVP we used FastAPI's BackgroundTasks, but learned we'd need Celery + Redis at production scale
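The concurrent-fetch idea can be sketched with stdlib asyncio (the fetch coroutine below is a stand-in for the real GitHub API calls made with httpx):

```python
import asyncio

async def fetch(name: str, delay: float) -> str:
    # Stand-in for an awaitable GitHub API call (httpx in the real code).
    await asyncio.sleep(delay)
    return name

async def fetch_pr_data() -> dict:
    # All three requests run concurrently; total wait is roughly the
    # slowest single call, not the sum of all three.
    meta, files, commits = await asyncio.gather(
        fetch("metadata", 0.01),
        fetch("files", 0.01),
        fetch("commits", 0.01),
    )
    return {"metadata": meta, "files": files, "commits": commits}

result = asyncio.run(fetch_pr_data())
```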
Product Learnings
Heuristics as fallback is essential: Not everyone has/wants an OpenAI key. Our pattern-matching fallback ensured the product works for everyone
Structured output > prose: Early versions returned paragraph reviews. Users wanted JSON they could integrate into other tools
Markdown formatting matters: GitHub-style markdown with emojis and checkboxes makes reviews feel native
Challenges We Faced
Challenge 1: GitHub API Rate Limits
Problem: Early tests hit rate limits quickly when fetching large PRs
Solution: Implemented intelligent caching and only fetch file contents when truly needed. For diffs, we use the unified diff endpoint instead of fetching files individually
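Requesting the unified diff in one call can be sketched with stdlib urllib (the real code uses httpx; the Accept media type below is GitHub's documented diff format, and the helper name is illustrative):

```python
import urllib.request
from typing import Optional

def build_diff_request(owner: str, repo: str, number: int,
                       token: Optional[str] = None) -> urllib.request.Request:
    """Build one request for a PR's unified diff instead of fetching files one by one."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{number}"
    # Asking for the diff media type returns the whole unified diff in one response.
    headers = {"Accept": "application/vnd.github.v3.diff"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)
```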
Challenge 2: Large Diffs Breaking Token Limits
Problem: Some PRs had 100+ file diffs that exceeded OpenAI's context window
Solution:
- Truncate diffs to 15,000 characters
- Prioritize showing changed code over unchanged context
- Use file-level summaries for very large PRs
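The truncation step might look like this (a sketch: the 15,000-character limit matches the text above, but the file-boundary heuristic is illustrative):

```python
MAX_DIFF_CHARS = 15_000

def truncate_diff(diff: str, limit: int = MAX_DIFF_CHARS) -> str:
    """Cut a unified diff at the limit, preferring to break at a file boundary."""
    if len(diff) <= limit:
        return diff
    head = diff[:limit]
    # Prefer ending at the last complete file section ("diff --git" marker).
    cut = head.rfind("\ndiff --git ")
    if cut > 0:
        head = head[:cut]
    return head + "\n... [diff truncated]"
```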
Challenge 3: Async Database Sessions in Background Tasks
Problem: SQLAlchemy sessions aren't thread-safe. Background tasks would crash trying to reuse the request's DB session
Solution: Create a fresh database session inside background tasks using the database URL
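The fix can be illustrated with stdlib sqlite3 for brevity (the real code builds a fresh SQLAlchemy session from the database URL; the names below are illustrative): each background task opens and closes its own connection.

```python
import sqlite3
import tempfile

# Stand-in for the configured DATABASE_URL (the real app reads this from config).
_tmp = tempfile.NamedTemporaryFile(suffix=".db", delete=False)
DB_PATH = _tmp.name
_tmp.close()

def run_analysis_task(run_id: int) -> str:
    # Open a FRESH connection inside the background task;
    # never reuse the request handler's session.
    conn = sqlite3.connect(DB_PATH)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS runs (id INTEGER, status TEXT)")
        conn.execute("INSERT INTO runs VALUES (?, ?)", (run_id, "completed"))
        conn.commit()
        row = conn.execute("SELECT status FROM runs WHERE id = ?", (run_id,)).fetchone()
        return row[0]
    finally:
        conn.close()
```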
Challenge 4: Heuristic Pattern False Positives
Problem: Regex for "password" flagged variables like password_reset_token (which is fine)
Solution: Used confidence scores (0.0-1.0) and let users filter by confidence threshold
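The filtering step is then straightforward (a sketch; the field names are illustrative):

```python
def filter_findings(findings: list, threshold: float = 0.5) -> list:
    """Keep only findings at or above the confidence threshold, so near-misses
    like password_reset_token can be tuned out rather than hard-coded away."""
    return [f for f in findings if f["confidence"] >= threshold]
```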
API Endpoints
Core Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/analyze` | Submit a PR for analysis |
| GET | `/api/runs/{run_id}` | Get analysis run status |
| GET | `/api/runs/{run_id}/result` | Get analysis results |
| GET | `/api/runs` | List recent analysis runs |
GitHub Integration
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/github/post-comment` | Post review to PR as comment |
| GET | `/api/github/validate-url` | Validate a PR URL |
Example Usage
Analyze a PR
curl -X POST http://localhost:8000/api/analyze \
-H "Content-Type: application/json" \
-d '{"pr_url": "https://github.com/owner/repo/pull/123"}'
Response:
{
"π run_id": 1,
"status": "pending",
"message": "Analysis started. Use GET /api/runs/{run_id} to check status."
}
Check Status
curl http://localhost:8000/api/runs/1
Get Results
curl http://localhost:8000/api/runs/1/result
Output Format
The analysis produces a structured JSON review:
{
"pr_summary": {
"what_changed": "...",
"why_it_changed": "...",
"key_files": ["..."]
},
"findings": [
{
"title": "Potential auth bypass in middleware",
"severity": "high",
"confidence": 0.74,
"file": "api/auth/middleware.py",
"evidence": "...",
"recommendation": "..."
}
],
"risk_matrix": {
βοΈ "security": "high",
"performance": "medium",
"breaking_change": "low",
"maintainability": "medium"
},
"test_plan": {
"unit_tests": ["..."],
"integration_tests": ["..."],
"edge_cases": ["..."]
},
"merge_readiness": {
"score": 78,
"blockers": ["..."],
"notes": "..."
}
}
Configuration
| Variable | Description | Required |
|---|---|---|
| `DATABASE_URL` | Database connection string | No (defaults to SQLite) |
| `OPENAI_API_KEY` | OpenAI API key for AI analysis | No (falls back to heuristics) |
| `GITHUB_TOKEN` | GitHub token for private repos/posting comments | No |
| `LLM_MODEL` | OpenAI model to use | No (defaults to `gpt-4o-mini`) |
Project Structure
backend/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI application
│   ├── config.py            # Settings and configuration
│   ├── models/              # SQLAlchemy models
│   │   ├── database.py
│   │   ├── user.py
│   │   └── analysis.py
│   ├── schemas/             # Pydantic schemas
│   │   ├── analysis.py
│   │   └── github.py
│   ├── routers/             # API routes
│   │   ├── analyze.py
│   │   ├── github.py
│   │   └── health.py
│   └── services/            # Business logic
│       ├── github.py        # GitHub API integration
│       └── analyzer.py      # Analysis engine
├── requirements.txt
├── .env.example
└── README.md
Development
Run tests
pytest
Key Takeaways
Building this project taught us that AI tooling is most powerful when it augments, not replaces. Our system:
- Provides instant feedback but doesn't block human review
- Catches common patterns but understands context matters
- Scales junior developers' capabilities without replacing senior expertise
The future of code review isn't AI versus humans; it's AI and humans working together to ship better code faster.
Built with ❤️ for ColorStack Winter Hackathon 2025

