About the Project
Overview
PatchForge is an autonomous DevSecOps agent that identifies and remediates software vulnerabilities by reasoning over the official NVD dataset, determining safe dependency upgrades, validating them, and opening a production-ready GitHub pull request. It combines a retrieval-augmented generation (RAG) layer built over NVD JSON feeds with NVIDIA Nemotron-70B to produce grounded, verifiable remediation plans.
The full system (scanner, RAG researcher, patch generator, validator, and PR creator) was implemented from scratch in just 24 hours.
What Inspired This
Modern software depends on thousands of open-source libraries. When vulnerabilities are disclosed, teams are flooded with CVE alerts, yet fixing and validating patches remains manual, costly, and error-prone. We asked:
Can AI act as a security engineer, not only detecting issues but also fixing them autonomously, grounded in official NVD data?
PatchForge was born to answer that.
What It Does
PatchForge automates the full remediation lifecycle:
- Scan repository dependency files (e.g., requirements.txt) for CVEs via OSV and local NVD data.
- Retrieve authoritative CVE context from a local vector index built from NVD JSON feeds (RAG).
- Reason with Nemotron-70B to select the minimal, safest upgrade.
- Patch dependency files deterministically, preserving formatting and comments.
- Validate fixes in an isolated environment with pip install and import checks.
- Commit & PR: Craft a professional GitHub pull request containing the retrieved NVD context, decision rationale, and validation logs.
Each stage is autonomous and fully traceable, producing review-ready outputs with no manual input.
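As an illustration of the deterministic patching step, here is a minimal sketch of how a single pinned requirement could be bumped while every other character of the file is left alone. This is not PatchForge's actual code: the function name bump_pinned_requirement and the package name examplepkg are hypothetical, and the sketch assumes simple `name==version` pins only (no ranges, no name normalization between hyphens and underscores).

```python
import re


def bump_pinned_requirement(text: str, package: str, new_version: str) -> str:
    """Rewrite only the `package==old` pin; leave comments and spacing untouched."""
    # Anchor on the package name at the start of the line, allow optional extras
    # such as pkg[security], then capture the pinned version up to whitespace,
    # an environment marker (;) or a trailing comment (#).
    pattern = re.compile(
        r"^(?P<head>\s*" + re.escape(package) + r"(?:\[[^\]]*\])?\s*==\s*)"
        r"(?P<ver>[^\s;#]+)",
        re.IGNORECASE,
    )
    patched = []
    for line in text.splitlines(keepends=True):
        # A function replacement avoids backslash surprises in new_version.
        patched.append(pattern.sub(lambda m: m.group("head") + new_version, line, count=1))
    return "".join(patched)


if __name__ == "__main__":
    original = "# web stack\nexamplepkg==1.0.0  # pinned for legacy API\nother==2.0.1\n"
    print(bump_pinned_requirement(original, "examplepkg", "1.0.1"), end="")
```

Because the pattern requires `==` directly after the name (plus optional extras), a similarly named package such as examplepkg-extras is not touched, and the trailing comment on the patched line survives unchanged.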
How We Built It
- Model: NVIDIA Nemotron-70B for structured reasoning and JSON-based decision output.
- RAG Layer: Local Chroma vector DB seeded from NVD JSON 2.0 feeds (e.g., data/cve-2024.json) for grounded retrieval in the demo.
- Agents: ScannerAgent → ResearcherAgent (RAG + Nemotron) → PatchGeneratorAgent → ValidatorAgent → PRCreatorAgent.
- Integrations: OSV, PyPI, GitHub REST API (PyGithub).
- Validation: Uses throwaway Python virtual environments to confirm package integrity and compatibility.
- Interface: Interactive CLI built with Rich/Colorama for live feedback, step logs, and cinematic visualization.
All components and orchestration were written from scratch within 24 hours.
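The venv-based validation described above can be sketched with the standard library alone. This is an assumption about the general shape, not the project's ValidatorAgent: the helper names (venv_python, build_commands, validate_upgrade), the timeout, and the choice of `--no-cache-dir` are all illustrative.

```python
import subprocess
import sys
import tempfile
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a venv (the layout differs on Windows)."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def build_commands(python: Path, requirement: str, module: str) -> list[list[str]]:
    """The two checks: install the pinned requirement, then import the module."""
    return [
        [str(python), "-m", "pip", "install", "--no-cache-dir", requirement],
        [str(python), "-c", f"import {module}"],
    ]


def validate_upgrade(requirement: str, module: str, timeout: int = 600) -> tuple[bool, str]:
    """Create a throwaway venv, run both checks, return (passed, combined log)."""
    log: list[str] = []
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "venv"
        venv.EnvBuilder(with_pip=True).create(env_dir)
        for cmd in build_commands(venv_python(env_dir), requirement, module):
            proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
            log.append(proc.stdout + proc.stderr)
            if proc.returncode != 0:
                # Stop at the first failing check and hand back the log.
                return False, "\n".join(log)
    return True, "\n".join(log)
```

A call such as validate_upgrade("examplepkg==1.0.1", "examplepkg") needs network access to a package index, and the temporary directory is deleted when the function returns. The returned log is the kind of text that could be attached to a PR or fed back to the model after a failure.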
Key Technical Highlights
- Retrieval-Augmented Reasoning: Nemotron reasons directly from the official NVD advisory text instead of static prompts, making every decision explainable.
- ReAct Loop: Failed validations trigger Nemotron-guided refinements until the patch passes all checks.
- Deterministic Patching: Only targeted lines are modified; comments and formatting are preserved.
- Professional PR Generation: Automatically composes a GitHub PR with vulnerability summary, CVSS score, fix reasoning, validation logs, and NVD/OSV links.
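The retry behaviour in the ReAct loop can be sketched as a propose/validate cycle in which each failure log is fed into the next proposal. Everything here is hypothetical scaffolding: propose stands in for the Nemotron call, validate for the isolated install check, and the max_attempts cap is an added assumption so the sketch always terminates.

```python
from typing import Callable, Optional


def remediate_with_retries(
    propose: Callable[[list[str]], Optional[str]],
    validate: Callable[[str], tuple[bool, str]],
    max_attempts: int = 3,
) -> tuple[Optional[str], list[str]]:
    """Alternate propose -> validate, passing accumulated failure logs to propose."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        candidate = propose(feedback)
        if candidate is None:
            break  # the proposer has no further ideas
        passed, log = validate(candidate)
        if passed:
            return candidate, feedback
        # Record why this candidate failed so the next proposal can react to it.
        feedback.append(f"{candidate} failed: {log}")
    return None, feedback


if __name__ == "__main__":
    # Stub proposer and validator: the first pin conflicts, the second passes.
    pins = iter(["examplepkg==2.0.0", "examplepkg==1.9.5"])
    result, history = remediate_with_retries(
        propose=lambda fb: next(pins, None),
        validate=lambda pin: (pin.endswith("1.9.5"), "simulated dependency conflict"),
    )
    print(result, history)
```

Returning the feedback history alongside the accepted candidate keeps the failed attempts available for the PR description.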
What We Learned
- Grounding model reasoning in real data (RAG) builds trust and reproducibility.
- Multi-agent coordination works best with explicit JSON messages between agents; free-text hand-offs invite hallucinated or missing fields.
- Validation and retry loops transform LLM reasoning into reliable engineering action.
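One way to enforce explicit JSON between agents is to parse the model's reply into a fixed schema and reject anything else. The schema below is an assumption for illustration; the field names (package, current_version, target_version, cve_ids, rationale) and the UpgradeDecision type do not come from the project.

```python
import json
from dataclasses import dataclass

REQUIRED_KEYS = {"package", "current_version", "target_version", "cve_ids", "rationale"}


@dataclass(frozen=True)
class UpgradeDecision:
    package: str
    current_version: str
    target_version: str
    cve_ids: tuple[str, ...]
    rationale: str


def parse_decision(raw: str) -> UpgradeDecision:
    """Parse a model reply strictly; raise ValueError on any schema violation."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("decision must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    extra = data.keys() - REQUIRED_KEYS
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} unexpected={sorted(extra)}")
    for key in ("package", "current_version", "target_version", "rationale"):
        if not isinstance(data[key], str) or not data[key].strip():
            raise ValueError(f"{key} must be a non-empty string")
    ids = data["cve_ids"]
    if not isinstance(ids, list) or not all(
        isinstance(i, str) and i.startswith("CVE-") for i in ids
    ):
        raise ValueError("cve_ids must be a list of CVE identifiers")
    return UpgradeDecision(
        package=data["package"],
        current_version=data["current_version"],
        target_version=data["target_version"],
        cve_ids=tuple(ids),
        rationale=data["rationale"],
    )
```

Rejecting unexpected keys as well as missing ones means a reply that drifts from the schema fails loudly instead of being passed silently to the next agent.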
Challenges
- NVD API volatility: We implemented a fallback to local NVD JSON feeds so lookups keep working when the API is slow or unavailable.
- Dependency conflicts: Some CVEs require multi-package coordination.
- Demo performance: Balancing deep reasoning with hackathon time limits required caching and pre-seeded datasets.
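The local feed fallback can be sketched as follows. The sketch assumes the NVD JSON 2.0 layout, where a top-level "vulnerabilities" array holds objects with a "cve" record carrying "id" and "descriptions"; fetch_remote is a hypothetical callable standing in for whatever performs the live NVD lookup, and the function names are illustrative.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def load_local_feed(path: Path) -> dict[str, dict]:
    """Index a local NVD JSON 2.0 feed by CVE ID (layout assumed, see lead-in)."""
    feed = json.loads(path.read_text(encoding="utf-8"))
    return {item["cve"]["id"]: item["cve"] for item in feed.get("vulnerabilities", [])}


def english_description(cve: dict) -> str:
    """Return the English advisory text of a CVE record, or an empty string."""
    for entry in cve.get("descriptions", []):
        if entry.get("lang") == "en":
            return entry.get("value", "")
    return ""


def get_cve(
    cve_id: str,
    fetch_remote: Callable[[str], Optional[dict]],
    feed_path: Path,
) -> Optional[dict]:
    """Try the live lookup first; on any failure fall back to the local feed."""
    try:
        record = fetch_remote(cve_id)
        if record is not None:
            return record
    except Exception:
        # Deliberately broad: timeouts, rate limits and bad responses all
        # lead to the same outcome, which is to use the local copy instead.
        pass
    return load_local_feed(feed_path).get(cve_id)
```

Re-reading the feed on every fallback keeps the sketch short; a longer-running process would likely load and index the file once and reuse the dictionary.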
What’s Next
- Extend support to npm, Maven, and Dockerfiles.
- Train a compact fine-tuned model on historical CVE→patch data.
- Build a web dashboard for multi-repo patch visibility, risk scoring, and approval workflows.
Demo Notes for Judges
- The live demo uses a seeded local NVD JSON file (data/cve-2024.json) for instant retrieval.
- The ReAct loop demonstrates an intentional conflict → reasoning → retry → success → GitHub PR creation.
- Example artifact: a real PR opened autonomously by PatchForge during the demo.
Tech Stack
Languages: Python 3.11
Model: NVIDIA Nemotron-70B
RAG: Chroma Vector DB + sentence embeddings
APIs: OSV, PyPI, GitHub REST API
Libraries: PyGithub, python-dotenv, Rich, Colorama
Runtime: venv-based isolated validation
Repository
Built With
- chromadb
- python

