Inspiration

I got overwhelmed trying to keep up with complex, fast-moving repositories like LangChain, which ships new features almost weekly. The usual approach sucked: 50 browser tabs, copy-pasting functions into ChatGPT, and still being confused about how the pieces fit together.

I wanted something that could understand the entire repository at once, like having a senior developer who has already read through everything. So I built DevAgent. Now I just paste a GitHub URL and start asking questions.

What it does

DevAgent turns any public GitHub repository into your personal code assistant:

Core features:

  • Analyzes entire repo structure and tech stack
  • Answers questions in plain English
  • Spots bugs and security vulnerabilities
  • Generates smart project-specific questions
  • Explains confusing functions

Game-changer (web search):

When you ask deployment/setup questions, it grabs current web information:

  • "How do I deploy this to AWS?" → Current AWS docs
  • "Latest security practices for Node.js?" → Recent security guides
  • "How to set up CI/CD?" → Current GitHub Actions tutorials

How I built it

The pipeline:

  • Download GitHub repos via ZIP
  • Smart chunking into meaningful segments
  • Vector embedding with SentenceTransformers
  • Intelligent routing (code vs web search vs security)
  • Background caching for instant responses
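The first two pipeline steps can be sketched roughly as follows. This is only an illustration, not DevAgent's actual chunker: the window size, the overlap, the list of file suffixes, and the function names `chunk_lines` and `chunk_repo_zip` are all assumptions. It reads a repository ZIP that is already in memory and splits each source file into overlapping line windows, keeping the file path and line range so an answer can point back to the code.

```python
import io
import zipfile

# Hypothetical settings: illustrative values, not DevAgent's real configuration.
CHUNK_LINES = 40
OVERLAP_LINES = 10
TEXT_SUFFIXES = (".py", ".js", ".ts", ".md", ".go", ".rs", ".java")


def chunk_lines(path, text, size=CHUNK_LINES, overlap=OVERLAP_LINES):
    """Split one file into overlapping line windows, keeping the line range."""
    lines = text.splitlines()
    step = size - overlap
    chunks = []
    for start in range(0, len(lines), step):
        window = lines[start:start + size]
        chunks.append({
            "path": path,
            "start_line": start + 1,
            "end_line": start + len(window),
            "text": "\n".join(window),
        })
        # Stop once this window has reached the end of the file.
        if start + size >= len(lines):
            break
    return chunks


def chunk_repo_zip(zip_bytes):
    """Walk a repository ZIP held in memory and chunk each recognized source file."""
    chunks = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.endswith(TEXT_SUFFIXES):
                continue
            text = archive.read(info).decode("utf-8", errors="replace")
            chunks.extend(chunk_lines(info.filename, text))
    return chunks
```

Each chunk's `text` would then be passed to the embedding model. Overlapping windows are one simple way to avoid cutting a function in half at a chunk boundary; a syntax-aware splitter would be a natural refinement.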

Key tech choices:

  • Streamlit for frontend (quick, decent-looking)
  • Kimi LLM with web search — this was the breakthrough that made everything possible
  • TiDB for vector storage — learned this distributed SQL database from scratch, game-changer for semantic search at scale
  • Modal for serverless (no infrastructure headaches)

What took forever:

  • TiDB's vector search optimization for massive repos
  • Kimi's web search integration with code analysis
  • Performance tuning for sub-second responses
  • Smart routing between information sources

Challenges I ran into

TiDB vector search was brutal initially. Querying by vector similarity is completely different from writing regular SQL, and I spent countless hours figuring out indexing strategies for repositories with 10,000+ files. But TiDB's distributed architecture and HTAP capabilities made it worth the learning curve.
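To make the indexing problem concrete, here is a minimal in-memory stand-in for what a vector search has to do. It is not TiDB's API: `cosine_similarity` and `top_k` are hypothetical helpers, and the brute-force scan over every row is exactly the work a vector index is meant to avoid on large repositories.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=3):
    """rows is a list of (chunk_id, embedding); return the k most similar chunk ids."""
    scored = [(cosine_similarity(query_vec, emb), chunk_id) for chunk_id, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]
```

With one embedding per chunk, this scan costs one similarity computation per stored chunk for every question, which is why the indexing strategy matters so much once a repository has thousands of files.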

Modal serverless packaging had edge cases that only appeared during deployment. Container configuration broke in unexpected ways.

Integrating Kimi LLM's web search with local code analysis was surprisingly complex. Building a routing system that intelligently decides between "analyze this code" vs "search for current practices" took multiple iterations.
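As a rough illustration of the routing idea, the sketch below uses plain keyword rules. The real router may well rely on the LLM or on embeddings instead; the patterns, the route names, and the `route` function are assumptions chosen to match the example questions in this writeup.

```python
import re

# Hypothetical patterns: questions about deployment, setup, or "latest" practices go to web search.
WEB_PATTERN = re.compile(r"\b(deploy\w*|latest|current|recent|ci/cd|set up|setup)\b")
# Hypothetical patterns: questions about vulnerabilities in the code go to the security scan.
SECURITY_PATTERN = re.compile(r"\b(vulnerab\w*|security|injection|xss)\b")


def route(question):
    """Return 'web', 'security', or 'code' for a user question."""
    q = question.lower()
    # Freshness and deployment cues win first, so "latest security practices" goes to the web.
    if WEB_PATTERN.search(q):
        return "web"
    if SECURITY_PATTERN.search(q):
        return "security"
    # Everything else is answered from the indexed repository.
    return "code"
```

Keyword rules like these are brittle, which is probably part of why getting the routing right took several iterations.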

Accomplishments that I'm proud of

  • Handles 10,000+ file repositories successfully
  • Sub-100ms responses for cached questions through TiDB's optimized vector search
  • Smart routing that knows when you need current web info vs code analysis
  • Kimi LLM's web search integration provides genuinely current information
  • Built entirely serverless with auto-scaling

The real win: people can understand complex codebases by just asking questions instead of spending days lost in file structures.

What I learned

Technical breakthroughs:

  • TiDB's distributed vector database architecture at scale
  • Kimi LLM's capabilities for combining code analysis with real-time web search
  • Serverless AI application design that's cost-effective
  • Multi-layer caching strategies for AI tools
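One way a multi-layer cache could be structured is sketched below, assuming a small in-process LRU layer in front of a slower shared store. The class name `TwoLayerCache`, the `cache_key` helper, and the dict used as the backing store are all hypothetical; in a real deployment the backing layer would more likely be a database table.

```python
import hashlib
from collections import OrderedDict


def cache_key(repo_url, question):
    """Key answers by repository plus a whitespace- and case-normalized question."""
    normalized = " ".join(question.lower().split())
    return hashlib.sha256(f"{repo_url}\n{normalized}".encode("utf-8")).hexdigest()


class TwoLayerCache:
    """Small in-process LRU layer in front of a slower shared store."""

    def __init__(self, backing_store, max_items=128):
        self.hot = OrderedDict()            # layer 1: fast, bounded, per-process
        self.backing_store = backing_store  # layer 2: any dict-like object (assumption)
        self.max_items = max_items

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)       # mark as most recently used
            return self.hot[key]
        value = self.backing_store.get(key)
        if value is not None:
            self._remember(key, value)      # promote into the hot layer
        return value

    def put(self, key, value):
        self.backing_store[key] = value
        self._remember(key, value)

    def _remember(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        if len(self.hot) > self.max_items:
            self.hot.popitem(last=False)    # evict the least recently used entry
```

Normalizing the question before hashing means trivially different phrasings ("What does main do?" vs "what does  main do?") hit the same cached answer.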

Key insights:

  • Users prioritize response speed over perfect accuracy
  • Kimi's web search dramatically increases usefulness beyond pure code analysis
  • TiDB's hybrid HTAP architecture is a good fit for this use case
  • Simple interfaces work better than feature-heavy ones

What's next for DevAgent

Short-term:

  • Private repo support with GitHub auth
  • User bookmarks and saved queries
  • Enhanced visualizations with dependency graphs
  • Mobile-responsive design

Long-term vision:

  • VS Code extension (leveraging TiDB's fast queries)
  • Team collaboration features
  • AI code generation using Kimi LLM's advanced reasoning
  • Enterprise features with TiDB's enterprise-grade security

The goal is making complex codebases accessible to anyone curious enough to ask questions. TiDB handles the scale, Kimi provides the intelligence — together they're transforming how people learn from code.
