🎯 Inspiration

Understanding large codebases is one of the biggest challenges developers face. We've all been there: staring at thousands of lines of code, trying to understand how everything connects. Traditional documentation quickly becomes outdated, and manually tracing dependencies is tedious and error-prone.

We built Synergy to solve this problem: What if you could instantly visualize any Python codebase and get AI-powered explanations for every component?

🚀 What it does

Synergy transforms Python repositories into interactive visual diagrams with four core features:

  1. Instant Analysis: Paste any GitHub URL and get a complete codebase visualization in seconds
  2. Interactive Diagrams: Pan, zoom, and click on any class or function to explore dependencies with health indicators (good ✅, warnings ⚠️, critical 🔴)
  3. AI Code Tutor: Click any node to get intelligent, context-aware explanations powered by OpenRouter's LLM APIs
  4. MCP Integration: Expose analysis capabilities to AI assistants (Claude Desktop, Antigravity) via Model Context Protocol for seamless codebase exploration

The system automatically detects classes, functions, imports, and relationships, then renders them as a beautiful Mermaid diagram. Each component is analyzed for complexity and potential issues, giving you instant insights into code health.
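As a rough illustration of the detection step, the sketch below walks a Python AST with the standard-library `ast` module and collects class names, function names, and imported modules. The function name `extract_structure` and the sample source are assumptions made for this example; Synergy's actual engine is not shown here and likely tracks far more detail.

```python
import ast


def extract_structure(source: str) -> dict:
    """Collect class names, function names, and imported modules (hypothetical helper)."""
    tree = ast.parse(source)
    found = {"classes": [], "functions": [], "imports": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            found["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found["functions"].append(node.name)
        elif isinstance(node, ast.Import):
            found["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have module=None.
            found["imports"].append(node.module or ".")
    return found


# Illustrative input only; it is parsed, never executed.
SAMPLE = '''
import os
from pathlib import Path

class Repo:
    def files(self):
        return list(Path(".").iterdir())

def main():
    return Repo().files()
'''

structure = extract_structure(SAMPLE)
print(structure)
```

Because the source is only parsed, this approach works on code whose dependencies are not installed, which is one reason static analysis suits arbitrary GitHub repositories.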

🛠️ How we built it

Architecture: Full-stack application with Docker containerization for seamless deployment

Backend (Python + FastAPI):

  • Custom static analysis engine that parses Python ASTs to extract code structure
  • In-memory graph model (NetworkX) to represent relationships between components
  • Mermaid renderer that converts code graphs into interactive diagrams
  • OpenRouter integration for AI-powered code explanations using state-of-the-art LLMs
  • MCP (Model Context Protocol) server exposing 10+ tools for codebase analysis, including:
    • analyze_codebase: Analyze any local Python project
    • search_nodes: Find classes/functions by name
    • get_node_neighbors: Explore dependencies (upstream/downstream)
    • find_path: Trace dependency chains between components
    • get_context_bundle: Retrieve source code + dependency signatures
    • scan_external_api_surface: Map all 3rd-party library usage
  • Dual transport modes for the MCP server: stdio (for Claude Desktop) & SSE (for web clients)
  • RESTful API with CORS support for seamless frontend communication
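To make the graph tools above more concrete, here is a minimal sketch of what neighbor lookup and path tracing could look like. It uses plain dictionaries and breadth-first search from the standard library as a stand-in for NetworkX; the edge data, the `depends_on`/`used_by` keys, and the function bodies are assumptions for illustration and only borrow the tool names `get_node_neighbors` and `find_path`.

```python
from collections import deque

# Hypothetical edges: (A, B) means "A depends on B".
EDGES = [
    ("api.routes", "analysis.engine"),
    ("analysis.engine", "analysis.parser"),
    ("analysis.engine", "render.mermaid"),
    ("cli.main", "analysis.engine"),
]


def build_adjacency(edges):
    """Build forward (depends_on) and reverse (used_by) adjacency maps."""
    depends_on, used_by = {}, {}
    for src, dst in edges:
        depends_on.setdefault(src, []).append(dst)
        used_by.setdefault(dst, []).append(src)
    return depends_on, used_by


def get_node_neighbors(node, depends_on, used_by):
    """Return both directions of a node's direct relationships."""
    return {
        "depends_on": sorted(depends_on.get(node, [])),
        "used_by": sorted(used_by.get(node, [])),
    }


def find_path(start, goal, depends_on):
    """Breadth-first search for the shortest dependency chain, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in depends_on.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


DEPENDS_ON, USED_BY = build_adjacency(EDGES)
engine_neighbors = get_node_neighbors("analysis.engine", DEPENDS_ON, USED_BY)
chain = find_path("api.routes", "render.mermaid", DEPENDS_ON)
print(engine_neighbors)
print(chain)
```

How "upstream" and "downstream" map onto these two directions is a naming choice; the sketch avoids the ambiguity by labeling the directions explicitly.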

Frontend (React + Vite):

  • Modern ChatGPT-inspired UI with dark mode, glassmorphism, and smooth animations
  • Interactive Mermaid diagrams with custom zoom, pan, and click handlers
  • Real-time analysis with loading states, error handling, and progress indicators
  • Sidebar navigation showing analysis history with timestamps
  • Health dashboard displaying code quality metrics at a glance

DevOps & Infrastructure:

  • Docker Compose for one-command deployment (docker compose up)
  • Volume mounting for hot-reload development without rebuilding
  • Environment variable management for secure API key handling
  • Multi-stage builds optimized for production deployment

💡 What we learned

  1. AST Parsing is Complex: Python's dynamic nature makes static analysis challenging. We had to handle edge cases like dynamic imports, decorators, metaclasses, and runtime type annotations.

  2. Mermaid Has Limits: Large diagrams (>15,000 characters) crash the renderer. We implemented size warnings and optimization strategies, and we plan to add intelligent node filtering.

  3. Docker Networking Magic: Ensuring the frontend can communicate with the backend across Docker networks required careful CORS configuration and understanding of Docker's internal DNS.

  4. Python Version Hell: We hit a wall with networkx not supporting Python 3.14.1 due to breaking changes in dataclasses. Docker saved us by providing a controlled Ubuntu 24.04 + Python 3.13 environment.

  5. LLM Integration Nuances: Balancing response quality with API rate limits required implementing fallback mechanisms, mock modes for development, and graceful error handling for quota exhaustion.

  6. State Management in React: Properly propagating graphId between components for the AI tutor required careful debugging and understanding of React's state lifecycle.
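The dynamic-import edge case from the first lesson can be illustrated with a small heuristic. The sketch below flags calls to `importlib.import_module(...)` and `__import__(...)` and records the target only when it is a string literal; anything computed at runtime is reported as unknown. The helper name and sample source are assumptions, and matching on the attribute name `import_module` alone is a deliberate simplification.

```python
import ast


def find_dynamic_imports(source: str) -> list:
    """Return (line, target) pairs; target is None when it cannot be known statically."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_dunder = isinstance(func, ast.Name) and func.id == "__import__"
        # Heuristic: any "<something>.import_module(...)" call is treated as importlib's.
        is_importlib = isinstance(func, ast.Attribute) and func.attr == "import_module"
        if not (is_dunder or is_importlib):
            continue
        arg = node.args[0] if node.args else None
        if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
            target = arg.value
        else:
            target = None  # module name is built at runtime
        results.append((node.lineno, target))
    return sorted(results, key=lambda item: item[0])


# Illustrative input only; it is parsed, never executed.
SAMPLE = '''import importlib
plugin = importlib.import_module("plugins.csv")
name = "plugins." + kind
other = importlib.import_module(name)
legacy = __import__("json")
'''

dynamic_imports = find_dynamic_imports(SAMPLE)
print(dynamic_imports)
```

Entries with an unknown target are the ones a static graph cannot resolve, so surfacing them as warnings is more honest than silently dropping the edge.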

🏗️ Challenges we faced

Challenge 1: Graph Rendering Performance

  • Problem: Large repositories (like Flask with 100+ classes) generated diagrams too big to render, causing browser crashes
  • Solution: Implemented size limits (15,000 chars) and performance warnings; we are also exploring graph simplification algorithms
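A size guard of this kind might look like the following sketch, which renders an edge list as Mermaid flowchart text and checks its length against the 15,000-character limit before handing it to the browser. The function names, the ID-sanitizing rule, and the shape of the returned report are assumptions for illustration.

```python
MAX_DIAGRAM_CHARS = 15_000  # limit quoted in the writeup


def mermaid_id(name: str) -> str:
    """Turn a dotted name into a simple Mermaid node ID."""
    return "".join(ch if ch.isalnum() else "_" for ch in name)


def render_mermaid(edges) -> str:
    """Render (source, target) pairs as a top-down Mermaid flowchart."""
    lines = ["graph TD"]
    for src, dst in edges:
        lines.append(f'    {mermaid_id(src)}["{src}"] --> {mermaid_id(dst)}["{dst}"]')
    return "\n".join(lines)


def check_diagram_size(diagram: str, limit: int = MAX_DIAGRAM_CHARS) -> dict:
    """Report the diagram size and whether it exceeds the limit."""
    size = len(diagram)
    return {"size": size, "limit": limit, "too_large": size > limit}


small = render_mermaid([("app.main", "app.db")])
large = render_mermaid([(f"pkg.mod{i}", f"pkg.dep{i}") for i in range(1000)])

print(small)
print(check_diagram_size(small))
print(check_diagram_size(large)["too_large"])
```

Checking on the backend lets the API return a warning instead of a diagram, so the frontend never has to attempt a render that is likely to fail.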

Challenge 2: Node Click Detection in SVG

  • Problem: Mermaid's generated SVG structure made it difficult to attach click handlers to specific nodes
  • Solution: Custom event delegation system that parses node IDs from SVG elements and maps them back to code components

Challenge 3: Cross-Origin Resource Sharing

  • Problem: Frontend running on port 5173 couldn't call backend API on port 8000 due to CORS restrictions
  • Solution: Configured FastAPI middleware with wildcard origins for development (will restrict in production)

Challenge 4: Empty Graph IDs in AI Tutor

  • Problem: The AI tutor was receiving empty graph_id values, causing 404 errors when trying to load code
  • Solution: Added comprehensive debug logging, fixed state propagation in React, and improved error messages with available file listings

Challenge 5: Docker Hot Reload

  • Problem: Changes to code required full Docker rebuilds, slowing down development
  • Solution: Implemented volume mounts for src/ and src_web/ directories, enabling instant code updates

🎓 What's next for Synergy

Immediate Roadmap:

  • ✅ MCP server with 10+ tools for AI assistant integration (COMPLETED)
  • ✅ Fix AI tutor graph_id propagation (COMPLETED)
  • 🔄 Implement graph filtering to handle large codebases (hide stdlib, collapse modules)
  • 🔄 Add diff visualization to compare code structure across branches

Future Vision:

  • Multi-language support: Extend beyond Python to JavaScript, TypeScript, Go, Rust, and Java
  • Collaborative features: Share visualizations with teams, add annotations, and enable real-time collaboration
  • CI/CD integration: Automatic diagram generation on every commit with GitHub Actions
  • Advanced analytics: Cyclomatic complexity, code coverage overlay, and technical debt indicators
  • VS Code extension: Inline diagram generation and AI explanations directly in your editor
  • Export options: PDF, PNG, and interactive HTML exports for documentation

Long-term Goals:

  • Architecture evolution tracking: Visualize how your codebase structure changes over time
  • Dependency risk analysis: Identify circular dependencies and suggest refactoring opportunities
  • Team insights: Show which parts of the codebase are most actively developed or need attention
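For the dependency risk analysis goal, circular dependencies can be found with a depth-first search that tracks which nodes are on the current path. The sketch below is one possible approach under assumed inputs (a dictionary mapping each module to the modules it depends on); it returns the first cycle it encounters, not all of them, and uses recursion, so very deep graphs would need an iterative variant.

```python
def find_cycle(graph: dict):
    """Return one dependency cycle as a list (first node repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path: we closed a loop.
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical module graphs for illustration.
CYCLIC = {"models": ["db"], "db": ["config"], "config": ["models"], "utils": []}
ACYCLIC = {"a": ["b"], "b": []}

cycle = find_cycle(CYCLIC)
print(cycle)
print(find_cycle(ACYCLIC))
```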

🔬 Research & Development Opportunities

Building on Synergy's graph-based analysis foundation, we've identified five key research directions for team expansion:

1. Big Data ML + Cross-Language Generalization

Focus: Train Graph Neural Networks (GNNs) on extracted graph data to enable multi-language support

  • Map Python AST patterns to C#, JavaScript, and Java structures using transfer learning
  • Predict missing edges in dynamically-typed languages via pattern recognition
  • Deliverable: ML models achieving 85%+ accuracy in cross-language dependency prediction
  • Impact: Enable Synergy to analyze polyglot codebases without language-specific parsers

2. Cybersecurity Threat Modeling

Focus: Extend analysis engine with vulnerability detection and blast radius visualization

  • Build SecurityScanner module that maps CVEs to specific graph nodes
  • Visualize "Vulnerability Heatmap" showing exploit propagation paths
  • Integrate with National Vulnerability Database (NVD) for real-time threat intelligence
  • Deliverable: Interactive security overlay showing dependency chains affected by known CVEs
  • Impact: Enable security teams to prioritize patching based on actual code usage

3. Enterprise Scaling + Hierarchical Agents

Focus: Scale the MCP server to handle 1000+ concurrent requests, with latency that grows sub-linearly as load increases

  • Implement Redis caching for frequently-queried graph patterns
  • Design hierarchical agent architecture for distributed analysis across microservices
  • Optimize graph storage using compressed sparse row (CSR) format
  • Deliverable: Benchmarks showing <100ms response time at 10,000 QPS
  • Impact: Enable enterprise adoption with SLA guarantees
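To show what the compressed sparse row idea means for a dependency graph, the sketch below packs an edge list into two flat integer arrays: `indptr` marks where each node's neighbor list starts and ends, and `indices` holds the neighbor IDs back to back. The function names and sample data are assumptions; a production version would more likely rely on an existing sparse-matrix library than on hand-rolled lists.

```python
def to_csr(edges, nodes):
    """Encode (source, target) edges as CSR arrays (indptr, indices)."""
    index = {name: i for i, name in enumerate(nodes)}

    # Count outgoing edges per node, then prefix-sum into row offsets.
    counts = [0] * len(nodes)
    for src, _ in edges:
        counts[index[src]] += 1
    indptr = [0]
    for count in counts:
        indptr.append(indptr[-1] + count)

    # Fill neighbor IDs, keeping a write cursor per row.
    indices = [0] * len(edges)
    cursor = indptr[:-1]  # slicing copies, so indptr is not modified
    for src, dst in edges:
        row = index[src]
        indices[cursor[row]] = index[dst]
        cursor[row] += 1
    return indptr, indices


def csr_neighbors(node_idx, indptr, indices):
    """Slice out the neighbor IDs of one node."""
    return indices[indptr[node_idx]:indptr[node_idx + 1]]


NODES = ["a", "b", "c"]
EDGES = [("a", "b"), ("a", "c"), ("c", "a")]

indptr, indices = to_csr(EDGES, NODES)
print(indptr, indices)
print(csr_neighbors(0, indptr, indices))
```

The appeal of this layout is that a neighbor lookup is a single contiguous slice and the whole graph is two arrays, which is compact to cache or serialize; the trade-off is that adding an edge means rebuilding the arrays.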

4. Consultancy + Automated Auditing

Focus: Generate automated architectural audit reports for consulting engagements

  • Build "Architectural Linter" detecting anti-patterns (God classes, circular deps, tight coupling)
  • Generate PDF audit reports with executive summaries and technical deep-dives
  • Conduct user studies comparing manual code review vs. automated analysis
  • Deliverable: "State of Open Source Architecture" report aggregating patterns from 1000+ repos
  • Impact: Enable consultants to deliver data-driven recommendations at scale

5. Production SaaS Infrastructure

Focus: Transform prototype into commercial-grade multi-tenant platform

  • Implement OAuth2/SAML authentication with role-based access control
  • Build usage-based billing system integrated with Stripe
  • Deploy multi-region infrastructure with CDN distribution for diagrams
  • Achieve SOC 2 Type II compliance for enterprise customers
  • Deliverable: Production deployment serving 10,000+ users with 99.9% uptime
  • Impact: Enable commercial launch and revenue generation
