Inspiration
Every day, developers need to understand unfamiliar codebases. Tools like Claude Code offer this capability, but behind a monthly subscription. I wanted to build a free, open-source alternative that anyone can use. The goal was simple: paste a GitHub URL, and start asking questions about the code.
What it does
RepoWhisperer clones any public GitHub repository, parses and indexes every file into searchable chunks, and lets you chat with the codebase through a conversational interface. You can ask about architecture, find where specific functionality lives, generate tests, or get code explanations. Responses stream in real time, and you can switch between four different AI models depending on your needs.
How I built it
The backend is a FastAPI application that handles repository cloning (via GitPython), file parsing, and chunking. Each file is split into contextual segments with language detection. When a user asks a question, a RAG (Retrieval-Augmented Generation) pipeline scores the indexed chunks by keyword relevance, assembles the top matches into a system prompt, and sends everything to DigitalOcean Gradient AI's Serverless Inference API. Responses stream back via Server-Sent Events.
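The retrieval step can be sketched as a simple keyword-overlap scorer: tokenize the question, count shared terms with each indexed chunk, keep the top matches, and fold them into a system prompt. This is a minimal sketch under my own naming (`Chunk`, `score_chunks`, `build_system_prompt` are illustrative, not the project's actual API), and the real scoring is likely more nuanced than a raw overlap count:

```python
import re
from dataclasses import dataclass

@dataclass
class Chunk:
    path: str   # file the chunk came from
    text: str   # the code segment itself

def tokenize(s: str) -> set[str]:
    # lowercase identifier-style tokens; crude but fast
    return set(re.findall(r"[a-z_]\w+", s.lower()))

def score_chunks(question: str, chunks: list[Chunk], top_k: int = 3) -> list[Chunk]:
    # rank chunks by how many question terms they share
    q_terms = tokenize(question)
    scored = [(len(q_terms & tokenize(c.text)), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]

def build_system_prompt(chunks: list[Chunk]) -> str:
    # assemble the top matches into the prompt sent to the model
    context = "\n\n".join(f"### {c.path}\n{c.text}" for c in chunks)
    return f"Answer using these repository excerpts:\n\n{context}"
```

The appeal of keyword scoring is that it needs no embedding model or vector store, which keeps the free tier genuinely free; the trade-off is that it misses semantically related code that shares no tokens with the question.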
The frontend is React with Vite and Tailwind CSS, designed with a terminal-inspired UI. The entire application is containerized with Docker and deployed on DigitalOcean App Platform.
All AI inference runs through Gradient AI's /chat/completions endpoint with four available models: Llama 3.3 70B for general code understanding, DeepSeek R1 for deep reasoning, Mistral Nemo for fast responses, and Llama 3 8B for quick queries.
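Because the endpoint is OpenAI-compatible, switching models comes down to swapping the `model` field in the request body. A minimal sketch of how a per-mode request could be assembled; the mode names and model identifiers below are placeholder assumptions, not the project's actual configuration:

```python
# Map UI modes to model IDs (IDs shown are illustrative placeholders).
MODELS = {
    "general": "llama3.3-70b-instruct",
    "reasoning": "deepseek-r1",
    "fast": "mistral-nemo-instruct",
    "quick": "llama3-8b-instruct",
}

def build_chat_request(mode: str, system_prompt: str, question: str) -> dict:
    # Payload for an OpenAI-compatible /chat/completions endpoint;
    # stream=True requests incremental chunks, which the backend
    # relays to the browser as Server-Sent Events.
    return {
        "model": MODELS[mode],
        "stream": True,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    }
```

Since only the `model` string varies per mode, adding or swapping a model is a one-line dictionary change, which is what made the four-model setup cheap to support.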
Challenges I faced
Getting the RAG pipeline right was the hardest part. Sending an entire repository to an LLM is not feasible due to context limits, so the chunking strategy and relevance scoring had to be tuned to surface the right code segments for each question. Balancing chunk size (too small loses context, too large wastes tokens) took several iterations.
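The chunker I converged on is essentially a sliding window over lines with a small overlap, so context is not lost at chunk boundaries. A sketch of the idea; the window and overlap sizes here are illustrative defaults, not the tuned values:

```python
def chunk_file(text: str, max_lines: int = 40, overlap: int = 5) -> list[str]:
    # Split a file into overlapping line windows. The overlap means a
    # function straddling a boundary still appears whole in one chunk.
    lines = text.splitlines()
    chunks = []
    step = max_lines - overlap
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + max_lines]))
        if start + max_lines >= len(lines):
            break
    return chunks
```

Raising `max_lines` reduces the number of chunks (and lookups) but burns more of the model's context window per retrieved chunk, which is exactly the small-vs-large trade-off described above.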
Deploying on App Platform also required working through environment variable configuration for API keys and getting the multi-stage Docker build to serve both the backend and frontend correctly through nginx.
What I learned
Building a RAG system from scratch gave me a much deeper understanding of how retrieval-augmented generation works in practice. I also learned how straightforward DigitalOcean's Gradient AI Serverless Inference is to integrate: because the API is OpenAI-compatible, supporting and switching between four different models required only minimal code changes.
What's next
Better retrieval through semantic search instead of keyword-based scoring, support for private repositories, and conversation memory across sessions.
Built With
- docker
- digitalocean-gradient-ai-serverless-inference
- nginx
- gitpython
- fastapi
- python
- pydantic
- react
- server-sent-events-(sse)
- tailwind-css
- digitalocean-app-platform
- vite