CS220-bot

Inspiration

People be writing trash code. In CS 220 here at UMass, we try to fix that! It's a whole class dedicated to programming methodology, where you painstakingly have to avoid obscure coding mistakes that you're not even sure are mistakes. All you can think about while submitting the assignments is whether or not you missed some arcane paradigm that was mentioned in a footnote of a programming book from 1980. The whole time you're wondering, "How the hell do I get better at this??" Well, what better way than to be aggressively insulted! Everyone knows about missing semicolons, syntax errors, and other common issues like that, but how often does someone point out repeated code (blasphemous), high cyclomatic complexity, and other harder-to-notice smells and issues? Our motivation for this project was to provide a tool that does just that, but with a bit of flair and sarcasm to keep it interesting.

What it does

The 220bot passive-aggressively points out common code smells. It does this by walking the program's Abstract Syntax Tree and using tree traversal algorithms to find patterns and subtrees that arise from bad coding practices. It highlights the exact line that's responsible and shows a passive-aggressively insulting message for the issue. It updates live as you code, so you're constantly reminded of how bad you are. It's available as a VSCode extension, so it's super easy to install and use straight out of the box! Here's the list of code smells and errors it detects:

High Cyclomatic Complexity
Dead Code
Deep Nesting
Duplicate Code
Magic Numbers
Long Functions
Print Statements
Functions with too many parameters
Comparing directly with True or False (Ex: "if a == True" instead of "if a")
Lack of context management (Ex: opening files outside of a "with" statement)
Non-specific Exception Handling (using a general "except" statement instead of specifying the exact exception)
Unused variables/imports
Cyclic imports
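
To give a feel for how checks like these can work, here is a minimal sketch of two of them (comparing with True/False and a bare "except") using Python's ast module. This is not our actual detector code; the SmellVisitor class, the find_smells helper, and the message strings are made up for illustration.

```python
import ast


class SmellVisitor(ast.NodeVisitor):
    """Collects (line, message) pairs for two of the smells listed above."""

    def __init__(self):
        self.findings = []

    def visit_Compare(self, node):
        # Flag `x == True`, `x != False`, and similar comparisons.
        uses_equality = any(isinstance(op, (ast.Eq, ast.NotEq)) for op in node.ops)
        operands = [node.left, *node.comparators]
        has_bool_literal = any(
            isinstance(o, ast.Constant) and (o.value is True or o.value is False)
            for o in operands
        )
        if uses_equality and has_bool_literal:
            self.findings.append((node.lineno, "comparison with True/False"))
        self.generic_visit(node)

    def visit_ExceptHandler(self, node):
        # A handler with no exception type is a bare `except:`.
        if node.type is None:
            self.findings.append((node.lineno, "bare except"))
        self.generic_visit(node)


def find_smells(source):
    """Parse `source` and return the findings sorted by line number."""
    visitor = SmellVisitor()
    visitor.visit(ast.parse(source))
    return sorted(visitor.findings)


SAMPLE = """\
if a == True:
    pass
try:
    f()
except:
    pass
"""

for line, message in find_smells(SAMPLE):
    print(f"line {line}: {message}")
```

Each finding carries the line number from the AST node, which is what makes it possible to highlight the exact offending line in the editor.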

How we built it

The VSCode frontend was built with JavaScript. The backend server was written in Python, with extensive use of the ast module to parse the program into an abstract syntax tree. We wrote the algorithms to detect all the code smells ourselves. We then used LangChain with the OpenAI APIs to generate the sarcastic code review comments.
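
As an example of the kind of algorithm this involves, here is a rough sketch of estimating cyclomatic complexity per function with the ast module. It uses the common "1 plus the number of decision points" approximation, which may differ from what our backend actually counts; the function names and the set of branch nodes are assumptions chosen for illustration.

```python
import ast

# Node types treated as one decision point each (an assumed, simplified set).
BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(func):
    """Approximate complexity: 1 + number of decision points inside `func`."""
    complexity = 1
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


def complexity_by_function(source):
    """Map each function name in `source` to its approximate complexity."""
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


SAMPLE = """
def f(x, y):
    if x and y:
        return 1
    for i in range(3):
        if i:
            pass
    return 0
"""

print(complexity_by_function(SAMPLE))
```

Note that in this sketch a nested function's branches also count toward its enclosing function, and a tool would then compare each score against some threshold before complaining.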

Challenges we ran into

Certain code smells, like duplicate code, were pretty hard to detect. We also had a lot of trouble integrating the business logic with the VSCode extension because it was pretty cumbersome to test. Testing it on 4 different systems was tough too, because we kept running into a variety of problems. Getting an LLM running was also very difficult, as we didn't have any GPUs or powerful computing resources.
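
To show why duplicate code is tricky, here is one simple way to approach it, sketched under our own assumptions rather than copied from the project: dump every statement subtree to a string with ast.dump (which leaves out line numbers by default) and group identical dumps. The find_duplicate_statements name and the min_nodes cutoff are hypothetical.

```python
import ast
from collections import defaultdict


def find_duplicate_statements(source, min_nodes=8):
    """Return groups of line numbers whose statements have identical ASTs.

    Statements with fewer than `min_nodes` AST nodes are skipped so that
    trivial lines like `x = 1` are not reported.
    """
    groups = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.stmt):
            continue
        size = sum(1 for _ in ast.walk(node))
        if size < min_nodes:
            continue
        # ast.dump omits line/column attributes by default, so two copies of
        # the same code at different locations produce the same string.
        groups[ast.dump(node)].append(node.lineno)
    return sorted(sorted(lines) for lines in groups.values() if len(lines) > 1)


SAMPLE = """\
def a(items):
    for item in items:
        if item > 10:
            print(item * 2)

def b(items):
    for item in items:
        if item > 10:
            print(item * 2)
"""

for lines in find_duplicate_statements(SAMPLE):
    print("duplicate statements at lines", lines)
```

This only catches exact structural copies: rename a variable and the dumps no longer match, and a duplicated block is reported once for each nested statement inside it. Handling those cases is where the real difficulty starts.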

Accomplishments that we're proud of

None of us had worked with Abstract Syntax Trees or VSCode extensions before. We're proud that we were able to finish a viable product. We're also proud of the number of code smells that we were able to write detectors for. It's a fairly large number given the complexity. We were also able to integrate a live LLM despite not having a GPU or a lot of computational power.

What we learned

AST parsing, VSCode extension building, backend/frontend integration, LangChain, etc.

What's next for SpaghettiSniffer

Add support for languages other than Python. We also want to do a lot of stress testing. It's currently a little slow for larger files and directories, so we'll try to make it faster.