AI PR Code Reviewer - Backend

πŸ’‘ Inspiration

Code reviews are crucial but time-consuming. As developers, we've all experienced:

  • The bottleneck: Waiting days for senior engineers to review PRs
  • The overwhelm: Reviewing a 50-file PR and missing critical security issues
  • The inconsistency: Different reviewers catching different things

We wanted to build an AI assistant that could provide instant, comprehensive code reviewsβ€”democratizing access to senior-level feedback for developers at all levels. The goal wasn't to replace human reviewers, but to augment them by catching common issues and providing a structured starting point.

🎯 What It Does

Our backend accepts a GitHub Pull Request URL and returns a comprehensive AI-powered code review containing:

  • βœ… PR Summary: What changed, why it changed, and key files
  • βœ… Prioritized Findings: Security vulnerabilities, performance issues, code quality concerns
  • βœ… Risk Matrix: Security, performance, breaking changes, and maintainability scoring
  • βœ… Test Plan: Suggested unit tests, integration tests, and edge cases
  • βœ… Merge Readiness Score: Overall assessment with blockers clearly identified
  • βœ… GitHub Integration: Post reviews directly as PR comments

πŸš€ How We Built It

Tech Stack Decision

We chose FastAPI (Python) for the backend because:

  • Async-first architecture for efficient GitHub API calls
  • Automatic API documentation with Swagger UI
  • Fast development cycle perfect for hackathon timelines
  • Strong typing with Pydantic for reliable data validation

Architecture

1. GitHub Service Layer

  • Parses PR URLs and extracts owner/repo/PR number
  • Fetches PR metadata, diffs, changed files, and commits via GitHub API
  • Handles OAuth for private repos
  • Posts formatted reviews back to GitHub as comments

2. Analysis Engine

  • Dual-mode analysis:
    • AI Mode: Uses OpenAI GPT-4o-mini with structured JSON output
    • Heuristics Mode: Pattern-matching fallback for offline/no-API-key scenarios
  • Scans for security patterns (hardcoded secrets, SQL injection, XSS)
  • Detects performance anti-patterns (N+1 queries, nested loops)
  • Identifies auth/permission changes requiring extra scrutiny

3. Data Layer (SQLAlchemy + SQLite)

  • AnalysisRun: Tracks each PR analysis request
  • AnalysisResult: Stores structured review output
  • User: Supports future GitHub OAuth integration

4. API Design (RESTful)

  • POST /api/analyze: Submit PR for analysis (returns run_id)
  • GET /api/runs/{id}: Check analysis status
  • GET /api/runs/{id}/result: Retrieve structured review
  • POST /api/github/post-comment: Post review to PR

The Math Behind Risk Scoring

# 1. Install dependencies
cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

# 2. Configure environment
cp .env.example .env
# (Optional) Edit .env to add OPENAI_API_KEY for AI analysis

# 3. Run the server
uvicorn app.main:app --reload --port 8000

# 4. Access interactive API docs
# Open: http://localhost:8000/docs

πŸ’» πŸ“‘ calculate merge readiness using a weighted penalty system:

$$\text{Score} = \max(0, 100 - 30n_c - 15n_h - 2n_t)$$

Where:

  • $n_c$ = number of critical findings
  • $n_h$ = number of high-severity findings
  • $n_t$ = total findings

This gives us a score between 0-100, with critical issues having exponentially more impact.

🧠 What We Learned

Technical Learnings

  1. Async Python is powerful: Using asyncio.gather() to fetch PR data concurrently reduced API call time by ~60%

  2. LLM output validation is critical: We implemented strict JSON schema validation because early tests showed GPT would occasionally output malformed JSON. Our retry logic with "fix invalid JSON" prompts improved reliability to 99%+

  3. GitHub API quirks: Redirects (301s) aren't followed by default in httpx. Adding follow_redirects=True solved mysterious failures

  4. Background tasks matter: For hackathon MVP, we used FastAPI's BackgroundTasks, but learned we'd need Celery + Redis for production scale

Product Learnings

  1. Heuristics as fallback is essential: Not everyone has/wants an OpenAI key. Our pattern-matching fallback ensured the product works for everyone

  2. Structured output > prose: Early versions returned paragraph reviews. Users wanted JSON they could integrate into other tools

  3. Markdown formatting matters: GitHub-style markdown with emojis and checkboxes makes reviews feel native

πŸ˜… Challenges We Faced

Challenge 1: GitHub API Rate Limits

Problem: Early tests hit rate limits quickly when fetching large PRs
Solution: Implemented intelligent caching and only fetch file contents when truly needed. For diffs, we use the unified diff endpoint instead of fetching files individually

Challenge 2: Large Diffs Breaking Token Limits

Problem: Some PRs had 100+ file diffs that exceeded OpenAI's context window
Solution:

  • Truncate diffs to 15,000 characters
  • Prioritize showing changed code over unchanged context
  • Use file-level summaries for very large PRs

Challenge 3: Async Database Sessions in Background Tasks

Problem: SQLAlchemy sessions aren't thread-safe. Background tasks would crash trying to reuse the request's DB session
Solution: Create a fresh database session inside background tasks using the database URL

Challenge 4: Heuristic Pattern False Positives

Problem: Regex for "password" flagged variables like password_reset_token (which is fine)
Solution: Used confidence scores (0.0-1.0) and let users filter by confidence threshold

πŸ”§ Quick Start

API Endpoints

Core Endpoints

Method Endpoint Description
POST /api/analyze Submit a PR for analysis
GET /api/runs/{run_id} Get analysis run status
GET /api/runs/{run_id}/result Get analysis results
GET /api/runs List recent analysis runs

GitHub Integration

Method Endpoint Description
POST /api/github/post-comment Post review to PR as comment
GET /api/github/validate-url Validate a PR URL

Example Usage

Analyze a PR

curl -X POST http://localhost:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"pr_url": "https://github.com/owner/repo/pull/123"}'

Response:

{
  "πŸ“Š run_id": 1,
  "status": "pending",
  "message": "Analysis started. Use GET /api/runs/{run_id} to check status."
}

Check Status

curl http://localhost:8000/api/runs/1

Get Results

curl http://localhost:8000/api/runs/1/result

Output Format

The analysis produces a structured JSON review:

{
  "pr_summary": {
    "what_changed": "...",
    "why_it_changed": "...",
    "key_files": ["..."]
  },
  "findings": [
    {
      "title": "Potential auth bypass in middleware",
      "severity": "high",
      "confidence": 0.74,
      "file": "api/auth/middleware.py",
      "evidence": "...",
      "recommendation": "..."
    }
  ],
  "risk_matrix": {
   βš™οΈ  "security": "high",
    "performance": "medium",
    "breaking_change": "low",
    "maintainability": "medium"
  },
  "test_plan": {
    "unit_tests": ["..."],
    "integration_tests": ["..."],
    "edge_cases": ["..."]
  }πŸ“ ,
  "merge_readiness": {
    "score": 78,
    "blockers": ["..."],
    "notes": "..."
  }
}

Configuration

Variable Description Required
DATABASE_URL Database connection string No (defaults to SQLite)
OPENAI_API_KEY OpenAI API key for AI analysis No (uses heuristics)
GITHUB_TOKEN GitHub token for private repos/posting No
LLM_MODEL OpenAI model to use No (defaults to gpt-4o-mini)

Project Structure

backend/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ main.py           # FastAPI application
β”‚   β”œβ”€β”€ config.py         # Settings and configuration
β”‚   β”œβ”€β”€ models/           # SQLAlchemy models
β”‚  πŸ› οΈ  β”‚   β”œβ”€β”€ database.py
β”‚   β”‚   β”œβ”€β”€ user.py
β”‚   β”‚   └── analysis.py
β”‚   β”œβ”€β”€ schemas/          # Pydantic schemas
β”‚   β”‚   β”œβ”€β”€ analysis.py
β”‚   β”‚   └── github.py
β”‚   β”œβ”€β”€ routers/          # API routes
β”‚   β”‚   β”œβ”€β”€ analyze.py
β”‚   β”‚   β”œβ”€β”€ github.py
β”‚   β”‚   └── health.py
β”‚   └── services/         # Business logic
β”‚       β”œβ”€β”€ github.py     # GitHub API integration
β”‚       └── analyzer.py   # Analysis engine
β”œβ”€β”€ requirements.txt
β”œβ”€β”€πŸš€  .env.example
└── README.md

Development

Run tests

pytest

πŸŽ“ Key Takeaways

Building this project taught us that AI tooling is most powerful when it augments, not replaces. Our system:

  • Provides instant feedback but doesn't block human review
  • Catches common patterns but understands context matters
  • Scales junior developers' capabilities without replacing senior expertise

The future of code review isn't AI vs humansβ€”it's AI + humans working together to ship better code faster.


Built with ❀️ for ColorStack Winter Hackathon 2025

Built With

  • fastapi-(python)
  • github-api/oauth
  • openai-gpt-4o-mini
  • openaiapi
  • pydantic
  • python
  • react
  • sqlalchemy-with-sqlite/postgresql
  • tailwindcss
  • uvicorn
Share this project:

Updates