RepoLens: The Architecture-First AI Agent 🔍

💡 Inspiration

As a software engineering student, I’ve often felt "onboarding paralysis": that overwhelming feeling when you clone a repository for the first time and realize there are hundreds of files with no clear map. Standard AI tools help with snippets, but they often lose the "big picture." They can tell you what a function does, but they struggle to explain how the entire system breathes.

I was inspired to build RepoLens to solve this. I wanted to create an agent that doesn't just read code, but understands the architecture, visualizes it, and acts as a senior partner who has already memorized every line of the repo.

🏗️ How I Built It

RepoLens is designed as a robust, containerized monorepo built for the AWS Amazon Nova Hackathon.

  • The Brain: I used Amazon Nova 2 Lite because its 1M token context window is a game-changer. For small-to-medium codebases, it let me skip a complex RAG (Retrieval-Augmented Generation) pipeline and feed the entire repository into the model, so its answers are grounded in the actual code rather than in a handful of retrieved fragments.
  • The Logic: I used LangGraph to manage the agentic state. This isn't just a chatbot; it's a multi-tool agent that decides whether it needs to generate a Mermaid.js diagram, run a security audit tool, or synthesize an architectural overview.
  • The Stack: The backend is powered by FastAPI for high-performance streaming via Server-Sent Events (SSE). The frontend is a modern Next.js interface styled with Tailwind CSS.
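To make the streaming part of the stack concrete, here is a minimal sketch of how model output could be framed as Server-Sent Events. It uses only the standard library; the names `sse_event` and `stream_answer`, the event names, and the payload fields are hypothetical and not taken from the RepoLens codebase.

```python
import json
from typing import Iterable, Iterator


def sse_event(data: dict, event: str = "message") -> str:
    """Serialize one payload as a single Server-Sent Events frame."""
    # json.dumps escapes newlines, so the payload always fits on one "data:" line.
    # The blank line at the end tells the browser the frame is complete.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_answer(chunks: Iterable[str]) -> Iterator[str]:
    """Yield one SSE frame per model chunk, then a final 'done' frame."""
    for index, text in enumerate(chunks):
        yield sse_event({"index": index, "text": text}, event="token")
    yield sse_event({"finished": True}, event="done")


if __name__ == "__main__":
    for frame in stream_answer(["The backend ", "exposes three routes."]):
        print(frame, end="")
```

In a FastAPI route, a generator like this could be wrapped in `StreamingResponse(stream_answer(chunks), media_type="text/event-stream")`, and the frontend would read it with the browser's `EventSource` API.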

🧠 What I Learned

This project was a deep dive into Agentic Design Patterns. I learned that the effectiveness of an AI agent isn't just about the LLM; it depends just as much on State Management.

  • I mastered how to structure "tools" within LangGraph so the agent can autonomously choose the right output format.
  • I gained a much deeper understanding of Token Management and how to pack a multi-file repository into a structured prompt that a model can navigate efficiently.
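The tool-selection idea can be illustrated without any framework. The sketch below is plain Python, not LangGraph's API: it assumes a hypothetical `AgentState` object and three stand-in tools, and it replaces the model's decision with a simple keyword check so the routing logic is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentState:
    """Hypothetical state carried from one agent step to the next."""
    question: str
    repo_context: str
    outputs: List[str] = field(default_factory=list)


# Stand-in tools: each one reads the state and returns a text result.
def draw_diagram(state: AgentState) -> str:
    return f"[diagram] Mermaid source for: {state.question}"


def run_audit(state: AgentState) -> str:
    return f"[audit] scanned {len(state.repo_context)} characters of code"


def summarize(state: AgentState) -> str:
    return f"[overview] architecture summary for: {state.question}"


TOOLS: Dict[str, Callable[[AgentState], str]] = {
    "diagram": draw_diagram,
    "audit": run_audit,
    "overview": summarize,
}


def choose_tool(question: str) -> str:
    """Keyword stand-in for the model's own tool choice (an assumption)."""
    lowered = question.lower()
    if any(word in lowered for word in ("diagram", "visualize", "flow")):
        return "diagram"
    if any(word in lowered for word in ("security", "vulnerab", "audit")):
        return "audit"
    return "overview"


def run_step(state: AgentState) -> AgentState:
    """Pick a tool, run it, and record the result in the shared state."""
    tool_name = choose_tool(state.question)
    state.outputs.append(TOOLS[tool_name](state))
    return state


if __name__ == "__main__":
    state = run_step(AgentState("Draw a diagram of the API layer", "def main(): ..."))
    print(state.outputs[0])
```

In the real agent the routing decision comes from the model rather than from keywords, but the shape is the same: a shared state object, a registry of tools, and a step that picks one tool and appends its result.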

🚧 Challenges I Faced

The biggest challenge was Repository Ingestion. Not all files in a repo are useful: binaries, node_modules, and .git folders just add noise. I had to build a custom "token packer" in the backend that filters out those files and maps the remaining ones into a directory hierarchy before anything reaches the LLM.
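A token packer along these lines might look like the sketch below. It is an illustration under stated assumptions, not the RepoLens implementation: the skip lists, the four-characters-per-token estimate, the `=== path ===` section markers, and the function names are all hypothetical.

```python
import os
from pathlib import Path
from typing import List, Tuple

# Assumed filter lists; a real project would tune these.
SKIP_DIRS = {".git", "node_modules", "__pycache__", "dist", "build"}
BINARY_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".ico", ".pdf", ".zip", ".woff"}


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def collect_files(root: Path) -> List[Path]:
    """Walk the repo, skipping noisy directories and binary file types."""
    kept: List[Path] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.suffix.lower() not in BINARY_SUFFIXES:
                kept.append(path)
    return kept


def pack_repo(root: Path, budget: int) -> Tuple[str, List[str]]:
    """Return (prompt, skipped): a file tree plus file bodies that fit the budget."""
    files = collect_files(root)
    names = [p.relative_to(root).as_posix() for p in files]
    tree = "FILE TREE\n" + "\n".join(names)
    parts = [tree]
    used = estimate_tokens(tree)
    skipped: List[str] = []
    for path, name in zip(files, names):
        try:
            body = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            skipped.append(name)  # unreadable or not valid text
            continue
        section = f"\n=== {name} ===\n{body}"
        cost = estimate_tokens(section)
        if used + cost > budget:
            skipped.append(name)  # would exceed the token budget
            continue
        parts.append(section)
        used += cost
    return "\n".join(parts), skipped
```

Putting the file tree first gives the model a map of the repository before it sees any file body, and returning the skipped list makes it possible to tell the user which files were left out.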

Another hurdle was Live Diagram Rendering. Getting the agent to consistently produce valid Mermaid.js syntax that the Next.js frontend could render in real time required strict prompt engineering and iterative refinement of the agent's tool definitions.
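Prompting can be backed by a cheap server-side check before a diagram is streamed to the browser. The sketch below is a heuristic only, not a Mermaid parser and not necessarily what RepoLens does: the function name `extract_mermaid`, the keyword list, and the bracket-balance rule are assumptions chosen for illustration.

```python
import re
from typing import Optional

# A subset of Mermaid diagram types; extend as needed.
DIAGRAM_KEYWORDS = (
    "graph", "flowchart", "sequenceDiagram",
    "classDiagram", "stateDiagram", "erDiagram",
)
# Matches an optional fenced block (three backticks, optionally tagged "mermaid").
FENCE = re.compile(r"`{3}(?:mermaid)?\s*\n(.*?)`{3}", re.DOTALL)


def extract_mermaid(reply: str) -> Optional[str]:
    """Return the diagram source if the reply looks like Mermaid, else None."""
    match = FENCE.search(reply)
    source = (match.group(1) if match else reply).strip()
    if not source:
        return None
    # The first line must declare a known diagram type.
    first_line = source.splitlines()[0].strip()
    if not first_line.startswith(DIAGRAM_KEYWORDS):
        return None
    # Cheap structural check: every bracket type must be balanced by count.
    for open_ch, close_ch in ("[]", "()", "{}"):
        if source.count(open_ch) != source.count(close_ch):
            return None
    return source
```

When the check returns `None`, the backend could ask the model to regenerate the diagram instead of sending broken syntax to the frontend.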

