Inspiration
Every day, developers need to understand unfamiliar codebases. Tools like Claude Code offer this capability, but behind a monthly subscription. I wanted to build a free, open-source alternative that anyone can use. The goal was simple: paste a GitHub URL, and start asking questions about the code.
What it does
RepoWhisperer clones any public GitHub repository, parses and indexes every file into searchable chunks, and lets you chat with the codebase through a conversational interface. You can ask about architecture, find where specific functionality lives, generate tests, or get code explanations. Responses stream in real time, and you can switch between four different AI models depending on your needs.
How I built it
The backend is a FastAPI application that handles repository cloning (via GitPython), file parsing, and chunking. Each file is split into contextual segments with language detection. When a user asks a question, a RAG (Retrieval-Augmented Generation) pipeline scores the indexed chunks by keyword relevance, assembles the top matches into a system prompt, and sends everything to DigitalOcean Gradient AI's Serverless Inference API. Responses stream back via Server-Sent Events.
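The retrieval step can be sketched as a simple keyword-overlap scorer: tokenize the question, count shared terms with each indexed chunk, keep the top matches, and fold them into a system prompt. This is a minimal sketch under my own naming (`Chunk`, `score_chunks`, `build_system_prompt` are illustrative, not the project's actual API), and the real scoring is likely more nuanced than a raw overlap count:

```python
import re
from dataclasses import dataclass

@dataclass
class Chunk:
    path: str   # file the chunk came from
    text: str   # the code segment itself

def tokenize(s: str) -> set[str]:
    # lowercase identifier-style tokens; crude but fast
    return set(re.findall(r"[a-z_]\w+", s.lower()))

def score_chunks(question: str, chunks: list[Chunk], top_k: int = 3) -> list[Chunk]:
    # rank chunks by how many question terms they share
    q_terms = tokenize(question)
    scored = [(len(q_terms & tokenize(c.text)), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]

def build_system_prompt(chunks: list[Chunk]) -> str:
    # assemble the top matches into the prompt sent to the model
    context = "\n\n".join(f"### {c.path}\n{c.text}" for c in chunks)
    return f"Answer using these repository excerpts:\n\n{context}"
```

The appeal of keyword scoring is that it needs no embedding model or vector store, which keeps the free tier genuinely free; the trade-off is that it misses semantically related code that shares no tokens with the question.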
The frontend is React with Vite and Tailwind CSS, designed with a terminal-inspired UI. The entire application is containerized with Docker and deployed on DigitalOcean App Platform.
All AI inference runs through Gradient AI's /chat/completions endpoint with four available models: Llama 3.3 70B for general code understanding, DeepSeek R1 for deep reasoning, Mistral Nemo for fast responses, and Llama 3 8B for quick queries.
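Because the endpoint is OpenAI-compatible, switching models comes down to swapping the `model` field in the request body. A minimal sketch of how a per-mode request could be assembled; the mode names and model identifiers below are placeholder assumptions, not the project's actual configuration:

```python
# Map UI modes to model IDs (IDs shown are illustrative placeholders).
MODELS = {
    "general": "llama3.3-70b-instruct",
    "reasoning": "deepseek-r1",
    "fast": "mistral-nemo-instruct",
    "quick": "llama3-8b-instruct",
}

def build_chat_request(mode: str, system_prompt: str, question: str) -> dict:
    # Payload for an OpenAI-compatible /chat/completions endpoint;
    # stream=True requests incremental chunks, which the backend
    # relays to the browser as Server-Sent Events.
    return {
        "model": MODELS[mode],
        "stream": True,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    }
```

Since only the `model` string varies per mode, adding or swapping a model is a one-line dictionary change, which is what made the four-model setup cheap to support.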
Challenges I faced
Getting the RAG pipeline right was the hardest part. Sending an entire repository to an LLM is not feasible due to context limits, so the chunking strategy and relevance scoring had to be tuned to surface the right code segments for each question. Balancing chunk size (too small loses context, too large wastes tokens) took several iterations.
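The chunker I converged on is essentially a sliding window over lines with a small overlap, so context is not lost at chunk boundaries. A sketch of the idea; the window and overlap sizes here are illustrative defaults, not the tuned values:

```python
def chunk_file(text: str, max_lines: int = 40, overlap: int = 5) -> list[str]:
    # Split a file into overlapping line windows. The overlap means a
    # function straddling a boundary still appears whole in one chunk.
    lines = text.splitlines()
    chunks = []
    step = max_lines - overlap
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + max_lines]))
        if start + max_lines >= len(lines):
            break
    return chunks
```

Raising `max_lines` reduces the number of chunks (and lookups) but burns more of the model's context window per retrieved chunk, which is exactly the small-vs-large trade-off described above.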
Deploying on App Platform also required working through environment variable configuration for API keys and getting the multi-stage Docker build to serve both the backend and frontend correctly through nginx.
What I learned
Building a RAG system from scratch gave me a much deeper understanding of how retrieval-augmented generation works in practice. I also learned how straightforward DigitalOcean's Gradient AI Serverless Inference is to integrate: because the API is OpenAI-compatible, supporting and switching between four different models required only minimal code changes.
What's next
Better retrieval through semantic search instead of keyword-based scoring, support for private repositories, and conversation memory across sessions.
Built With
- docker
- digitalocean-gradient-ai-serverless-inference
- nginx
- gitpython
- fastapi
- python
- pydantic
- react
- server-sent-events-(sse)
- tailwind-css
- digitalocean-app-platform
- vite