Inspiration

Imagine you're a developer joining a new team, or an open-source enthusiast looking to contribute to open source. You're ready to code, but instead, you spend the first three days just clicking through folders. We've all been there: staring at a sea of Python files, trying to guess how the data flows or where the entry point even is. Developers reportedly spend up to 70% of their time just trying to understand existing codebases rather than actually building new features. This isn't just a waste of time; it's a barrier to innovation.

What it does

That is why we created hAI-Buddy. hAI-Buddy is a codebase explorer that functions as an "architectural mapper." It doesn't just read your code; it understands the relationships within it, building a live map and acting as an expert guide for any developer entering a new ecosystem.

How we built it

We combined cutting-edge technologies to deliver a robust and user-friendly platform:

Frontend Development:

Built with React, ensuring a responsive and intuitive user interface that works seamlessly across devices.

Backend Processing:

Powered by FastAPI, which provides a fast, reliable framework for managing user requests, processing data, and interfacing with the AI components.

Agent Component:

Leverages a fine-tuned, ultra-fast AI model from Google's Gemini Flash family, purpose-built for conversational interactions, to intelligently read and summarize code. Optimized to deliver natural, context-aware reasoning, it explains complex codebases in simple, human-like language, making it feel less like documentation and more like a conversation with a knowledgeable teammate.

Graph Generation:

Integrated the open-source Python library NetworkX to generate an interactive "web" of file relationships, visually mapping how different parts of the codebase connect. This creates an intuitive and engaging experience that mirrors real dependency flows within the project, making complex architectures easier to explore and understand.
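The graph-generation idea above can be sketched in a few lines. This is a minimal, illustrative version (not the production pipeline): it uses the standard-library `ast` module to collect import statements and NetworkX to record "importer → imported" edges between files in the same project. The function name `build_import_graph` and the flat module-name matching are assumptions of this sketch.

```python
import ast
from pathlib import Path

import networkx as nx

def build_import_graph(root: str) -> nx.DiGraph:
    """Map each .py file under `root` to the local modules it imports."""
    root_path = Path(root)
    modules = {p.stem: p for p in root_path.rglob("*.py")}
    graph = nx.DiGraph()
    graph.add_nodes_from(modules)
    for name, path in modules.items():
        tree = ast.parse(path.read_text(encoding="utf-8", errors="replace"))
        for node in ast.walk(tree):
            targets = []
            if isinstance(node, ast.Import):
                targets = [alias.name.split(".")[0] for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                targets = [node.module.split(".")[0]]
            for target in targets:
                if target in modules:  # keep only intra-project edges
                    graph.add_edge(name, target)
    return graph
```

A real mapper would also resolve packages and relative imports; the sketch only matches top-level module names, which is enough to show the shape of the approach.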

Challenges we ran into

Developing hAI-buddy.tech involved overcoming several complex challenges:

Context Window Management:

Large-scale repositories often exceed standard LLM token limits. We implemented Context Pruning to strip noise (boilerplate, non-functional assets) and keep the model focused on the most important parts of the codebase.

Data Sanitization:

To prevent parsing failures, we built a robust pre-processing pipeline that handles non-UTF-8 characters and corrupted artifacts, ensuring Zero-Discrepancy Parsing.

Latency Reduction:

By fetching only the necessary metadata, we reduced repository mapping time for mid-sized projects to under 30 seconds.

Technical Integration:

Combining React, FastAPI, the Gemini API, and NetworkX into a cohesive system presented its own set of integration challenges.
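To make the Context Pruning challenge concrete, here is one minimal way such a step could look, assuming Python sources: drop docstrings, comments, and blank lines so more functional code fits in the model's context window. The function name `prune_source` and these particular heuristics are illustrative, not the exact production logic.

```python
import ast

def prune_source(source: str) -> str:
    """Return source with docstrings removed and comments/blank lines dropped."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Module, ast.ClassDef,
                             ast.FunctionDef, ast.AsyncFunctionDef)):
            body = node.body
            # A leading string-constant expression is a docstring: remove it.
            if (body and isinstance(body[0], ast.Expr)
                    and isinstance(body[0].value, ast.Constant)
                    and isinstance(body[0].value.value, str)):
                body.pop(0)
                if not body:
                    body.append(ast.Pass())  # keep the block syntactically valid
    # ast.unparse re-emits code without comments, normalizing whitespace too.
    pruned = ast.unparse(tree)
    return "\n".join(line for line in pruned.splitlines() if line.strip())
```

Round-tripping through the AST is a convenient trick here: comments never reach the tree, so re-emitting the code discards them for free.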

Accomplishments that we're proud of

Optimized and Scalable Platform: Built a system that performs efficiently and scales to serve a growing number of users.

User-Centric Design: Successfully created an interface that is both accessible and intuitive, ensuring that users can easily engage with the platform.

Advanced AI Capabilities: Leveraged Google's advanced LLM models to build an intelligent agent capable of understanding and responding to complex queries in natural language. This enabled a seamless, conversational experience, providing users with clear, context-aware guidance that feels intuitive and supportive.

Visual Proof of Concept: Transformed the codebase into more than just a tangled "hairball" of connections. The result is a structured, navigable visualization that makes even complex architectures feel organized and approachable.

High-Performance Repository Exploration: Engineered a metadata extraction pipeline that selectively retrieves only the most essential information, dramatically reducing processing overhead. This enabled fast, efficient repository analysis while maintaining high-quality, context-rich insights.

What we learned

This project has been an invaluable learning experience, enhancing our expertise in several key areas:

Graph Theory Matters: We learned that "In-Degree" centrality is a reliable way to find a project's core utilities (modules that many others import), while "Out-Degree" helps identify complex business logic controllers (modules that orchestrate many others).
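On a toy import graph, the centrality heuristic above takes only a couple of lines with NetworkX. The module names and the "importer → imported" edge direction are assumptions of this sketch.

```python
import networkx as nx

# Edges read "importer -> imported": api imports utils, and so on.
g = nx.DiGraph([
    ("api", "utils"), ("api", "models"),
    ("cli", "utils"), ("cli", "models"),
    ("models", "utils"),
])

# High in-degree = widely depended-upon (core utility);
# high out-degree = depends on many modules (orchestrating controller).
core_utility = max(g.nodes, key=g.in_degree)
controller = max(g.nodes, key=g.out_degree)
```

Here `utils` has in-degree 3, so it surfaces as the core utility; `api` and `cli` tie on out-degree as the controller candidates.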

LLM Context Management: We discovered that feeding an LLM a graph's structure before the code helps it answer architectural questions with much higher accuracy and fewer hallucinations.
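A sketch of that "graph before code" lesson: serialize the dependency graph as plain text and place it ahead of the source files in the prompt. The prompt wording and the function name `architecture_prompt` are hypothetical; the actual call to the Gemini API is omitted.

```python
import networkx as nx

def architecture_prompt(graph: nx.DiGraph,
                        sources: dict[str, str],
                        question: str) -> str:
    """Build a prompt that presents the import graph before the code."""
    edges = "\n".join(f"- {src} imports {dst}" for src, dst in graph.edges)
    code = "\n\n".join(f"# file: {name}\n{body}"
                       for name, body in sources.items())
    return (
        "You are a codebase guide. First, here is the project's import graph:\n"
        f"{edges}\n\n"
        "Now the source files:\n"
        f"{code}\n\n"
        f"Question: {question}"
    )
```

The ordering matters: the model sees the skeleton of the architecture before any implementation detail, which is the property we found reduced hallucinated relationships.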

Technical Integration: Learned how to seamlessly integrate diverse technologies into a unified platform, ensuring reliability and performance.

What's next for hAI Buddy

Looking forward, we are excited to expand and enhance the platform:

Interactive D3.js Graphs: Moving from static images to a fully interactive, zoomable, and clickable node map in the browser.

Multi-Language Support: Expanding our AST parsers beyond Python to include TypeScript, Go, and Rust.

Auto-Refactor Suggestions: Using the dependency map to identify "circular dependencies" or "spaghetti code" and suggesting architectural improvements.
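The circular-dependency check planned above falls out of the import graph almost for free: NetworkX's `simple_cycles` enumerates cycles in a directed graph. The module names below are illustrative.

```python
import networkx as nx

# Edges read "importer -> imported"; signals -> views closes a cycle.
g = nx.DiGraph([
    ("views", "models"),
    ("models", "signals"),
    ("signals", "views"),
    ("models", "utils"),
])

cycles = list(nx.simple_cycles(g))
for cycle in cycles:
    print("circular dependency:", " -> ".join(cycle + cycle[:1]))
```

Each detected cycle is a natural anchor for a refactoring suggestion, e.g. extracting the shared piece into a module neither side imports from.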

PR Reviews: Integrating hAI-buddy into GitHub Actions to explain to reviewers how a specific PR changes the system's overall topology.

Our goal is to remove the "fear of the unknown" when opening a new project, and we remain committed to expanding hAI-buddy to support every developer's journey.
