What it does
CodeVisualizer is a tool which allows developers unfamiliar with a codebase to transform it into an intuitive node-based graphic. Moreover, it further supports developers by using the power of LLMs to further describe functions, classes or even modules, as well as offering a Co-pilot assistant which is there to answer any questions.
Inspiration
Our inspiration for this project was the large challenge we have each faced in being onboarded to new software architectures. We realized that software developers spend too long in a ramp-up period, which is frustrating for developers and costly for companies.
How we built it
Our solution has 5 major steps:
Creating the graph of code blocks: The first step is to understand the structure of the codebase to eventually be able to visualize it. To do this, we used pycg library, static analysis of project structure and a lot of code to compartmentalize the relationships between code blocks (including files, classes, files and modules).
Filtering the graph (for different abstraction levels): After the graph is created, we need to filter the data based on abstraction levels for the visualization. This is important as one of our goals is to be able to provide an intuitive perspective for users at different levels of abstraction for intuitive learning. However, it proved to be a challenging step. To reorganize and filter the data, we used Depth-first search algorithms as well as lowest common ancestor algortihm to find shared root packages between functions and find out which packages use which packages. At the end, we were able to aggreggate the lower level connections to higher level ones.
Explaining the graph: After having the right structure for the relationship between code blocks, we furthered added data to add transparency about the proceses. For example, we used LLMs (powered by GPT-4-turbo) to describe the underlying code blocks in more detail, powered by memoization to cache the blocks that had been analzsed and improve speed.
Visualizing the graph: After the data is filtered and described, we can now visualize it for developers. To do this, we decided on a node-based graphic due to its interactivity and intuitiveness. To avoid collissions between nodes, and thus generate dozens of them dynamically, we used the Dagre Tree algorithm, which allowed us to lay them out automatically in a given canvas. Moreover, we used a ReactJS framework, coupled with modules such as ChakraUI, to create beautiful nodes that connected to each other in an intuitive and aesthetic design.
Supporting the end-user: Finally, we couple our visualization tools with other capabilities that keep supporting developers in real-time. Specifically, we added a co-pilot agent (powered by GPT-3.5) to our platform, which is able to answer pressing questions about the codebase. The chatbot not only takes historical chat data into consideration, but chart and code data available to users.
Challenges we ran into
Cracking the codebase: By far the biggest challenge that we faced was finding a way to understand a codebase, structure it appropriately and finally communicate it to other developers.
Creating a dynamic node chart: Moreover, creating dynamic node-charts, which are interactive and aesthethic, is not easy - it takes a lot of work to guarantee an end result that one is content with. So, finding a way to visualize our large amounts of data automatically proved challenging.
Reliability of LLMs: We found the usage of LLMs challenging for two reasons: their honesty and their performance. In terms of the first point, we had to make careful attention that LLMs were not hallucinating, as our explanations are based on their technology, which was not easy. In terms of their performance, we found their speed to be too slow at times, and had to find novel ways of speeding up their process.
Accomplishments that we're proud of
- First and foremost, we are proud of an incredible weekend together, and having a blast as a team while still pushing for a solution we were all passionate about
- We're also particularly proud of the code we created to understand and organize the structure of a codebase. It was no intuitive feat, and we were exhilirated when we finally cracked it
- Finally, we're also very happy about being able to offer a semantic analysis for users.
What we learned
Hahaha, too much. But some of the highlights:
- How to understand and structure a codebase
- How to use ReactorFlow to create visuallz appealing designs
- How to use LLMs to understand and explain code blocks to developers
- Leverage frameworks and well-known tools to speed your progress
What's next for JetBrains Coolio
Jetbrains Coolio are ready!
Log in or sign up for Devpost to join the conversation.