Inspiration

We were inspired by the growing need to manage and understand massive, complex software codebases. This led us to the story of "Alex," a lead engineer at a fast-scaling startup. Alex's days are a blur of feature development, urgent bug fixes, and the most draining task of all: slogging through pull requests. The team is stretched thin, facing burnout, and this operational drag jeopardizes their market opportunity.

With the arrival of large-context models like Gemini 2.5 Pro and Google's Agent Development Kit (ADK), we envisioned a solution: a self-orchestrated multi-agent system that handles this tedious work automatically and gives engineers like Alex their time back to innovate.

What it does

Code Analyst Agent is a multi-agent system that automatically performs a comprehensive audit of any public or private GitHub repository. The process is fully automated:

  1. Clones a GitHub Repository: The system starts by cloning the target codebase.
  2. Orchestrates Specialized Agents: A primary Orchestrator Agent, built with Google's ADK, coordinates the workflow and assigns tasks to specialized sub-agents.
  3. Analyzes Code Structure: A Code Analyst Agent parses the entire codebase, building an Abstract Syntax Tree (AST) to map its structure and dependencies.
  4. Scans for Security Vulnerabilities: A dedicated Security Agent scans for known CVEs by cross-referencing the repository's libraries and code patterns against vulnerability datasets queried through Google BigQuery.
  5. Detects Performance Issues: A Performance Agent systematically detects inefficiencies, such as deeply nested loops, large files, and other anti-patterns that could slow down the application.
  6. Leverages Large Context Understanding: The entire process is powered by Gemini 2.5 Pro, whose large context window lets the agents reason over the codebase as a whole rather than in isolated fragments.
  7. Generates Actionable Reports: Finally, the system generates a rich JSON report and a polished HTML dashboard that presents all findings, insights, and actionable recommendations in a clean, developer-friendly format.
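
As an illustration of step 5, here is a minimal sketch of how a nested-loop and large-file check could be written with Python's standard ast module. It is not the project's actual Performance Agent: the function names, the rule names, and the thresholds (max_depth, max_lines) are assumptions made for this example.

```python
import ast


def max_loop_depth(source: str) -> int:
    """Return the deepest level of nested for/while loops in the source."""
    tree = ast.parse(source)

    def depth(node: ast.AST, current: int) -> int:
        # Entering a loop node increases the nesting level by one.
        if isinstance(node, (ast.For, ast.AsyncFor, ast.While)):
            current += 1
        deepest = current
        for child in ast.iter_child_nodes(node):
            deepest = max(deepest, depth(child, current))
        return deepest

    return depth(tree, 0)


def find_performance_issues(source: str, max_depth: int = 2, max_lines: int = 500) -> list:
    """Flag deeply nested loops and large files (thresholds are illustrative)."""
    findings = []
    loop_depth = max_loop_depth(source)
    if loop_depth > max_depth:
        findings.append({"rule": "deeply-nested-loops", "depth": loop_depth})
    line_count = len(source.splitlines())
    if line_count > max_lines:
        findings.append({"rule": "large-file", "lines": line_count})
    return findings


# A small sample with three levels of loop nesting.
SAMPLE = """
for a in range(3):
    for b in range(3):
        for c in range(3):
            print(a, b, c)
"""

issues = find_performance_issues(SAMPLE)
```

A static check like this only measures syntactic nesting; it says nothing about how often the loops actually run, which is why the findings are best treated as hints for a reviewer.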

How we built it

Our technical foundation combines Google's cutting-edge AI tools with a robust, scalable architecture.

  • Agent Framework: We used Google's Agent Development Kit (ADK) to build, test, and orchestrate our multi-agent system.
  • AI Model: Gemini 2.5 Pro is the core of the system. Its large context window and reasoning capabilities let us analyze a codebase in a single pass instead of splitting it into fragments.
  • Custom Tools: We developed a custom CodeUnderstandingTool for AST parsing and a BigQueryTool for looking up CVEs efficiently.
  • API & CLI: The backend is served by a Flask API, while a Python CLI provides a user-friendly interface for local development and execution.
  • Reporting: We used Jinja2 to dynamically generate the final HTML dashboard from the analysis results.
  • Deployment & Infrastructure: The application is containerized with Docker and can be deployed on Google Cloud Run for serverless execution, on Google Kubernetes Engine (GKE) for scaled workloads, or on Vertex AI Workbench for development and testing.
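
The orchestration pattern described above can be sketched in plain Python. This is a conceptual illustration only and does not use the ADK API: the three agent functions are hypothetical stand-ins that return empty findings, and the orchestrate function and its report layout are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


# Hypothetical stand-ins for the sub-agents. Each takes a repository path
# and returns a dictionary of findings.
def structure_agent(repo_path: str) -> dict:
    return {"agent": "structure", "findings": []}


def security_agent(repo_path: str) -> dict:
    return {"agent": "security", "findings": []}


def performance_agent(repo_path: str) -> dict:
    return {"agent": "performance", "findings": []}


def orchestrate(repo_path: str, agents: Dict[str, Callable[[str], dict]]) -> dict:
    """Run every agent in parallel and collect results or errors by agent name."""
    report = {"repository": repo_path, "results": {}, "errors": {}}
    # max_workers must be at least 1, even when no agents are registered.
    with ThreadPoolExecutor(max_workers=len(agents) or 1) as pool:
        futures = {name: pool.submit(fn, repo_path) for name, fn in agents.items()}
        for name, future in futures.items():
            try:
                report["results"][name] = future.result(timeout=300)
            except Exception as exc:
                # One failing agent should not discard the other agents' work.
                report["errors"][name] = repr(exc)
    return report


report = orchestrate(
    "/tmp/example-repo",
    {
        "structure": structure_agent,
        "security": security_agent,
        "performance": performance_agent,
    },
)
```

Recording failures per agent, rather than letting one exception abort the run, keeps a partial report available when a single analysis step times out or crashes.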

Challenges we ran into

  • Managing huge codebases without running into memory or timeout issues during analysis.
  • Running multiple agent tasks in parallel without making redundant API calls to the LLM.
  • Designing a clean, intuitive, and developer-friendly report format that clearly communicates complex findings.
  • Debugging the orchestration layer between local development, the Docker container, and the final cloud deployment.
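
One common way to avoid redundant LLM calls is to cache responses keyed by a hash of the prompt. The sketch below shows that idea under stated assumptions: CachedLLM is a hypothetical wrapper, call_model stands in for whatever function actually contacts the model, and nothing here reflects how the project itself solved the problem.

```python
import hashlib
import threading
from typing import Callable, Dict


class CachedLLM:
    """Wrap a model-calling function so identical prompts are sent only once."""

    def __init__(self, call_model: Callable[[str], str]):
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self._lock = threading.Lock()
        self.calls_made = 0

    def ask(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        # Holding the lock for the whole call is the simplest correct choice
        # for parallel agents, at the cost of serializing model requests.
        with self._lock:
            if key not in self._cache:
                self.calls_made += 1
                self._cache[key] = self._call_model(prompt)
            return self._cache[key]


def fake_model(prompt: str) -> str:
    # Placeholder for a real model call, used so the example runs offline.
    return "summary of: " + prompt


llm = CachedLLM(fake_model)
first = llm.ask("Summarize module auth.py")
second = llm.ask("Summarize module auth.py")
```

Here the second request is answered from the cache, so llm.calls_made stays at 1. A production version would likely use per-key locking and a size limit on the cache.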

Accomplishments that we're proud of

  • Building a robust, modular, and fully autonomous multi-agent system from scratch using Google's ADK.
  • Demonstrating the advantage of Gemini 2.5 Pro's large context window for complex code analysis tasks.
  • Engineering the tool to be easy to run both locally via a simple CLI and as a scalable service in the cloud.
  • Delivering clear, valuable, and actionable insights that can help real engineering teams improve their code quality.

What we learned

  • How to effectively leverage Google's ADK for building and managing real-world multi-agent workflows.
  • Best practices for prompting and structuring data so that large-context models like Gemini 2.5 Pro can perform specific code-related tasks.
  • Techniques for optimizing the cost and latency of LLM calls, especially when dealing with parallel agent requests.
  • The importance of building unified backend logic that can serve both a user-friendly CLI and a web API.
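
The last point can be illustrated with a small sketch in which one core function holds the analysis logic and the CLI is a thin wrapper around it. The names run_analysis, build_parser, and main, the code-analyst program name, and the --no-security flag are all hypothetical, and the function body is a placeholder rather than the project's real pipeline.

```python
import argparse
import json
from typing import List, Optional


def run_analysis(repo_url: str, include_security: bool = True) -> dict:
    """Shared entry point; the body is a placeholder for the real pipeline."""
    agents = ["structure", "performance"]
    if include_security:
        agents.append("security")
    return {"repository": repo_url, "agents": agents, "findings": []}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="code-analyst")
    parser.add_argument("repo_url", help="URL of the GitHub repository to audit")
    parser.add_argument("--no-security", action="store_true",
                        help="skip the security scan")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """CLI wrapper: parse arguments, call the shared function, print JSON."""
    args = build_parser().parse_args(argv)
    report = run_analysis(args.repo_url, include_security=not args.no_security)
    print(json.dumps(report, indent=2))
    return 0
```

A web route would call run_analysis with values taken from the request and return the same dictionary as JSON, so both interfaces stay consistent without duplicating logic.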

What's next for Code Analyst Agent

Our vision is to evolve this tool into an indispensable part of the development lifecycle. Our immediate next steps include:

  • Extend Language Support: Add analysis capabilities for more languages, including Java, TypeScript, and Go.
  • Add Dynamic Checks: Incorporate runtime profiling and dynamic performance checks, not just static analysis.
  • Provide Automated Fixes: Move beyond diagnostics to provide automated, AI-generated code fixes that developers can review and apply directly.
  • CI/CD Integration: Integrate directly into CI/CD pipelines (e.g., via GitHub Actions) for continuous, automated code auditing on every commit.
  • Team Dashboards: Build collaborative team dashboards with historical trend analysis and project-level insights.

Built With

  • bigquery
  • docker
  • flask
  • gemini-2.5-pro
  • github-actions
  • github-api
  • google-agent-development-kit-(adk)
  • google-cloud-run
  • google-kubernetes-engine-(gke)
  • google-vertex-ai-workbench
  • gunicorn
  • jinja
  • python