The Problem
Reading a large, undocumented codebase to work out what a project does, or whether it has basic security flaws, takes too much time. I built RepoReader AI to automate this. It takes a public GitHub URL, fetches the core files, and uses Gemini 2.5 Flash to generate a tech stack summary, a list of security red flags, and a live Mermaid.js architecture diagram.
How I Built It
I used FastAPI for the backend to handle routing and the GitHub API calls. The frontend is built with Streamlit for a clean, reactive UI. The core engine is Google Gemini 2.5 Flash. Instead of pulling the whole repo, the backend fetches only the entry points and config files (such as main.py and Dockerfile), which saves tokens and cuts processing time.
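The selective fetch described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `parse_repo_url` and `contents_urls` and the `CANDIDATE_FILES` shortlist are assumptions (the write-up only names main.py and Dockerfile). The URLs follow the public GitHub REST "repository contents" endpoint; the sketch only builds them and makes no network calls.

```python
from typing import List, Sequence, Tuple
from urllib.parse import urlparse

# Hypothetical shortlist; only main.py and Dockerfile are named in the write-up.
CANDIDATE_FILES = ("main.py", "Dockerfile", "requirements.txt", "package.json")


def parse_repo_url(url: str) -> Tuple[str, str]:
    """Split a public GitHub URL into (owner, repo)."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {url}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected github.com/<owner>/<repo>: {url}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo


def contents_urls(url: str, files: Sequence[str] = CANDIDATE_FILES) -> List[str]:
    """Build one GitHub contents-API URL per candidate file."""
    owner, repo = parse_repo_url(url)
    base = f"https://api.github.com/repos/{owner}/{repo}/contents"
    return [f"{base}/{name}" for name in files]
```

Each URL can then be requested individually, and files that return 404 are simply skipped, so the prompt only ever contains the handful of files that exist.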
The Prompt Engineering Challenge
Getting an LLM to output valid Mermaid.js inside a strict JSON schema without breaking the Streamlit UI was the hardest part. Raw newlines and Markdown backticks in the output kept breaking the JSON parser. To fix this, I tightened the prompt: I forced a JSON response MIME type, instructed Gemini to separate graph statements with semicolons (;) instead of newlines, and required alphanumeric node IDs to keep rendering stable.
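On the receiving side, the semicolon convention might be unpacked along these lines. This is a sketch under assumptions: the JSON key `diagram` and the function names are hypothetical, and the fence-stripping step is a defensive extra for cases where the model wraps its answer in backticks anyway.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this listing tidy


def strip_fences(raw: str) -> str:
    """Remove a wrapping Markdown code fence if the model added one."""
    text = raw.strip()
    if text.startswith(FENCE):
        text = re.sub(r"^`{3}[a-zA-Z]*\s*", "", text)  # opening fence plus language tag
        text = re.sub(r"\s*`{3}$", "", text)           # closing fence
    return text


def mermaid_from_response(raw: str, key: str = "diagram") -> str:
    """Parse the JSON reply and turn semicolon-separated statements into lines."""
    data = json.loads(strip_fences(raw))
    statements = [s.strip() for s in data[key].split(";") if s.strip()]
    return "\n".join(statements)
```

Note that this simple split would also break on a semicolon inside a node label, which is one more reason to keep labels and node IDs plain.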
What's Next
For future updates, I plan to add a feature where users can input custom file names for dynamic, deep-dive analysis.
Built With
- fastapi
- google-gemini
- mermaid-js
- pydantic
- python
- streamlit