AI PR Code Reviewer - Backend
Inspiration
Code reviews are crucial but time-consuming. As developers, we've all experienced:
- The bottleneck: Waiting days for senior engineers to review PRs
- The overwhelm: Reviewing a 50-file PR and missing critical security issues
- The inconsistency: Different reviewers catching different things
We wanted to build an AI assistant that could provide instant, comprehensive code reviews, democratizing access to senior-level feedback for developers at all levels. The goal wasn't to replace human reviewers but to augment them by catching common issues and providing a structured starting point.
What It Does
Our backend accepts a GitHub Pull Request URL and returns a comprehensive AI-powered code review containing:
- PR Summary: What changed, why it changed, and key files
- Prioritized Findings: Security vulnerabilities, performance issues, code quality concerns
- Risk Matrix: Security, performance, breaking-change, and maintainability scoring
- Test Plan: Suggested unit tests, integration tests, and edge cases
- Merge Readiness Score: Overall assessment with blockers clearly identified
- GitHub Integration: Post reviews directly as PR comments
How We Built It
Tech Stack Decision
We chose FastAPI (Python) for the backend because:
- Async-first architecture for efficient GitHub API calls
- Automatic API documentation with Swagger UI
- Fast development cycle perfect for hackathon timelines
- Strong typing with Pydantic for reliable data validation
Architecture
1. GitHub Service Layer
- Parses PR URLs and extracts owner/repo/PR number
- Fetches PR metadata, diffs, changed files, and commits via GitHub API
- Handles OAuth for private repos
- Posts formatted reviews back to GitHub as comments
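The URL-parsing step can be sketched with a small regex helper (the function and pattern names here are illustrative, not the actual service code):

```python
import re

# Illustrative pattern; the real service code may differ.
PR_URL_RE = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/pull/(?P<number>\d+)"
)

def parse_pr_url(url: str) -> tuple:
    """Extract (owner, repo, pr_number) from a GitHub pull request URL."""
    match = PR_URL_RE.match(url.strip())
    if match is None:
        raise ValueError(f"Not a valid GitHub PR URL: {url}")
    return match["owner"], match["repo"], int(match["number"])
```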
2. Analysis Engine
- Dual-mode analysis:
- AI Mode: Uses OpenAI GPT-4o-mini with structured JSON output
- Heuristics Mode: Pattern-matching fallback for offline/no-API-key scenarios
- Scans for security patterns (hardcoded secrets, SQL injection, XSS)
- Detects performance anti-patterns (N+1 queries, nested loops)
- Identifies auth/permission changes requiring extra scrutiny
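Heuristics mode can be sketched as a pattern table applied to the added lines of a diff (a minimal sketch; the hypothetical pattern set below is far smaller than the real one):

```python
import re

# Hypothetical subset of the heuristic pattern table (the real set is larger).
SECURITY_PATTERNS = [
    ("Hardcoded secret",
     re.compile(r"(?i)(api_key|secret|password)\s*=\s*['\"][^'\"]+['\"]"), 0.6),
    ("Possible SQL injection",
     re.compile(r"(?i)execute\(.+%s"), 0.5),
]

def scan_diff(diff: str) -> list:
    """Scan only ADDED lines of a unified diff against the pattern table."""
    findings = []
    for lineno, line in enumerate(diff.splitlines(), start=1):
        if not line.startswith("+"):
            continue  # unchanged/removed lines are skipped
        for title, pattern, confidence in SECURITY_PATTERNS:
            if pattern.search(line):
                findings.append(
                    {"title": title, "line": lineno, "confidence": confidence}
                )
    return findings
```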
3. Data Layer (SQLAlchemy + SQLite)
- AnalysisRun: Tracks each PR analysis request
- AnalysisResult: Stores the structured review output
- User: Supports future GitHub OAuth integration
4. API Design (RESTful)
- POST /api/analyze: Submit a PR for analysis (returns a run_id)
- GET /api/runs/{id}: Check analysis status
- GET /api/runs/{id}/result: Retrieve the structured review
- POST /api/github/post-comment: Post the review to the PR
Quick Start
# 1. Install dependencies
cd backend
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# 2. Configure environment
cp .env.example .env
# (Optional) Edit .env to add OPENAI_API_KEY for AI analysis
# 3. Run the server
uvicorn app.main:app --reload --port 8000
# 4. Access interactive API docs
# Open: http://localhost:8000/docs
The Math Behind Risk Scoring
We calculate merge readiness using a weighted penalty system:
$$\text{Score} = \max(0, 100 - 30n_c - 15n_h - 2n_t)$$
Where:
- $n_c$ = number of critical findings
- $n_h$ = number of high-severity findings
- $n_t$ = total findings
This gives us a score between 0 and 100, with critical issues penalized far more heavily than minor ones.
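The formula translates directly into a small helper (a sketch; the function name is illustrative):

```python
def merge_readiness_score(n_critical: int, n_high: int, n_total: int) -> int:
    """Weighted penalty system: Score = max(0, 100 - 30*n_c - 15*n_h - 2*n_t)."""
    return max(0, 100 - 30 * n_critical - 15 * n_high - 2 * n_total)
```

For example, one critical and one high finding out of five total yields max(0, 100 - 30 - 15 - 10) = 45, well below merge readiness.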
What We Learned
Technical Learnings
- Async Python is powerful: Using asyncio.gather() to fetch PR data concurrently reduced API call time by ~60%
- LLM output validation is critical: We implemented strict JSON schema validation because early tests showed GPT would occasionally emit malformed JSON; our retry logic with "fix invalid JSON" prompts improved reliability to 99%+
- GitHub API quirks: httpx doesn't follow redirects (301s) by default; adding follow_redirects=True solved mysterious failures
- Background tasks matter: For the hackathon MVP we used FastAPI's BackgroundTasks, but learned we'd need Celery + Redis at production scale
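The concurrent-fetch idea can be sketched with stdlib asyncio (the fetch coroutine below is a stand-in for the real GitHub API calls made with httpx):

```python
import asyncio

async def fetch(name: str, delay: float) -> str:
    # Stand-in for an awaitable GitHub API call (httpx in the real code).
    await asyncio.sleep(delay)
    return name

async def fetch_pr_data() -> dict:
    # All three requests run concurrently; total wait is roughly the
    # slowest single call, not the sum of all three.
    meta, files, commits = await asyncio.gather(
        fetch("metadata", 0.01),
        fetch("files", 0.01),
        fetch("commits", 0.01),
    )
    return {"metadata": meta, "files": files, "commits": commits}

result = asyncio.run(fetch_pr_data())
```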
Product Learnings
Heuristics as fallback is essential: Not everyone has/wants an OpenAI key. Our pattern-matching fallback ensured the product works for everyone
Structured output > prose: Early versions returned paragraph reviews. Users wanted JSON they could integrate into other tools
Markdown formatting matters: GitHub-style markdown with emojis and checkboxes makes reviews feel native
Challenges We Faced
Challenge 1: GitHub API Rate Limits
Problem: Early tests hit rate limits quickly when fetching large PRs
Solution: Implemented intelligent caching and only fetch file contents when truly needed. For diffs, we use the unified diff endpoint instead of fetching files individually
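Requesting the unified diff in one call can be sketched with stdlib urllib (the real code uses httpx; the Accept media type below is GitHub's documented diff format, and the helper name is illustrative):

```python
import urllib.request
from typing import Optional

def build_diff_request(owner: str, repo: str, number: int,
                       token: Optional[str] = None) -> urllib.request.Request:
    """Build one request for a PR's unified diff instead of fetching files one by one."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{number}"
    # Asking for the diff media type returns the whole unified diff in one response.
    headers = {"Accept": "application/vnd.github.v3.diff"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)
```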
Challenge 2: Large Diffs Breaking Token Limits
Problem: Some PRs had 100+ file diffs that exceeded OpenAI's context window
Solution:
- Truncate diffs to 15,000 characters
- Prioritize showing changed code over unchanged context
- Use file-level summaries for very large PRs
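The truncation step might look like this (a sketch: the 15,000-character limit matches the text above, but the file-boundary heuristic is illustrative):

```python
MAX_DIFF_CHARS = 15_000

def truncate_diff(diff: str, limit: int = MAX_DIFF_CHARS) -> str:
    """Cut a unified diff at the limit, preferring to break at a file boundary."""
    if len(diff) <= limit:
        return diff
    head = diff[:limit]
    # Prefer ending at the last complete file section ("diff --git" marker).
    cut = head.rfind("\ndiff --git ")
    if cut > 0:
        head = head[:cut]
    return head + "\n... [diff truncated]"
```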
Challenge 3: Async Database Sessions in Background Tasks
Problem: SQLAlchemy sessions aren't thread-safe. Background tasks would crash trying to reuse the request's DB session
Solution: Create a fresh database session inside background tasks using the database URL
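The fix can be illustrated with stdlib sqlite3 for brevity (the real code builds a fresh SQLAlchemy session from the database URL; the names below are illustrative): each background task opens and closes its own connection.

```python
import sqlite3
import tempfile

# Stand-in for the configured DATABASE_URL (the real app reads this from config).
_tmp = tempfile.NamedTemporaryFile(suffix=".db", delete=False)
DB_PATH = _tmp.name
_tmp.close()

def run_analysis_task(run_id: int) -> str:
    # Open a FRESH connection inside the background task;
    # never reuse the request handler's session.
    conn = sqlite3.connect(DB_PATH)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS runs (id INTEGER, status TEXT)")
        conn.execute("INSERT INTO runs VALUES (?, ?)", (run_id, "completed"))
        conn.commit()
        row = conn.execute("SELECT status FROM runs WHERE id = ?", (run_id,)).fetchone()
        return row[0]
    finally:
        conn.close()
```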
Challenge 4: Heuristic Pattern False Positives
Problem: Regex for "password" flagged variables like password_reset_token (which is fine)
Solution: Used confidence scores (0.0-1.0) and let users filter by confidence threshold
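The filtering step is then straightforward (a sketch; the field names are illustrative):

```python
def filter_findings(findings: list, threshold: float = 0.5) -> list:
    """Keep only findings at or above the confidence threshold, so near-misses
    like password_reset_token can be tuned out rather than hard-coded away."""
    return [f for f in findings if f["confidence"] >= threshold]
```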
API Endpoints
Core Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/analyze` | Submit a PR for analysis |
| GET | `/api/runs/{run_id}` | Get analysis run status |
| GET | `/api/runs/{run_id}/result` | Get analysis results |
| GET | `/api/runs` | List recent analysis runs |
GitHub Integration
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/github/post-comment` | Post review to PR as comment |
| GET | `/api/github/validate-url` | Validate a PR URL |
Example Usage
Analyze a PR
curl -X POST http://localhost:8000/api/analyze \
-H "Content-Type: application/json" \
-d '{"pr_url": "https://github.com/owner/repo/pull/123"}'
Response:
{
"π run_id": 1,
"status": "pending",
"message": "Analysis started. Use GET /api/runs/{run_id} to check status."
}
Check Status
curl http://localhost:8000/api/runs/1
Get Results
curl http://localhost:8000/api/runs/1/result
Output Format
The analysis produces a structured JSON review:
{
"pr_summary": {
"what_changed": "...",
"why_it_changed": "...",
"key_files": ["..."]
},
"findings": [
{
"title": "Potential auth bypass in middleware",
"severity": "high",
"confidence": 0.74,
"file": "api/auth/middleware.py",
"evidence": "...",
"recommendation": "..."
}
],
"risk_matrix": {
βοΈ "security": "high",
"performance": "medium",
"breaking_change": "low",
"maintainability": "medium"
},
"test_plan": {
"unit_tests": ["..."],
"integration_tests": ["..."],
"edge_cases": ["..."]
},
"merge_readiness": {
"score": 78,
"blockers": ["..."],
"notes": "..."
}
}
Configuration
| Variable | Description | Required |
|---|---|---|
| `DATABASE_URL` | Database connection string | No (defaults to SQLite) |
| `OPENAI_API_KEY` | OpenAI API key for AI analysis | No (falls back to heuristics) |
| `GITHUB_TOKEN` | GitHub token for private repos/posting comments | No |
| `LLM_MODEL` | OpenAI model to use | No (defaults to `gpt-4o-mini`) |
Project Structure
backend/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI application
│   ├── config.py            # Settings and configuration
│   ├── models/              # SQLAlchemy models
│   │   ├── database.py
│   │   ├── user.py
│   │   └── analysis.py
│   ├── schemas/             # Pydantic schemas
│   │   ├── analysis.py
│   │   └── github.py
│   ├── routers/             # API routes
│   │   ├── analyze.py
│   │   ├── github.py
│   │   └── health.py
│   └── services/            # Business logic
│       ├── github.py        # GitHub API integration
│       └── analyzer.py      # Analysis engine
├── requirements.txt
├── .env.example
└── README.md
Development
Run tests
pytest
Key Takeaways
Building this project taught us that AI tooling is most powerful when it augments, not replaces. Our system:
- Provides instant feedback but doesn't block human review
- Catches common patterns but understands context matters
- Scales junior developers' capabilities without replacing senior expertise
The future of code review isn't AI versus humans; it's AI and humans working together to ship better code faster.
Built with ❤️ for ColorStack Winter Hackathon 2025

