Inspiration
Testing code for edge cases has always been tedious and time-consuming. Having to manually come up with the inputs and, even worse, the correct output would always be a struggle. Ultimately, I'd miss something and end up with a hard-to-find bug later down the line. That's why I made TestCaser.
What it does
TestCaser creates test cases for Python coding problems you're trying to solve. It provides the inputs and expected output and automatically checks your code against the generated test cases. If your code fails the test case, it generates an explanation for where your code went wrong and provides suggestions on how to fix it.
The website takes a description of what the Python code is supposed to do, the code itself (either typed in or uploaded as a file), the name of the function to be tested, and optional restrictions on the input (e.g., max length, min length). Test cases are generated from the description and the input restrictions. The user-entered code is then run, and its output is compared to the expected output as determined by ChatGPT. If the two conflict, ChatGPT checks the expected output once more, and if it determines the code is still wrong, it generates an explanation of what caused the code to fail the test case, along with suggestions for fixing it. Along with the results for the user-entered code, the run time is also reported.
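The compare-then-recheck flow described above can be sketched roughly as follows. This is a minimal illustration, not TestCaser's actual code: `check_test_case`, `recheck_expected`, and `explain_failure` are hypothetical names, and the two callables stand in for the ChatGPT requests, whose exact prompts and return formats are assumptions here.

```python
from typing import Any, Callable, Dict


def check_test_case(
    test_input: Any,
    actual: Any,
    expected: Any,
    recheck_expected: Callable[[Any, Any], Any],
    explain_failure: Callable[[Any, Any, Any], str],
) -> Dict[str, Any]:
    """Compare the user's output with the expected output for one test case.

    recheck_expected and explain_failure are hypothetical stand-ins for the
    ChatGPT calls; they are passed in so the flow can be tried without an API.
    """
    if actual == expected:
        return {"passed": True}

    # On a mismatch, ask for the expected output to be verified once more
    # before blaming the user's code.
    confirmed = recheck_expected(test_input, expected)
    if actual == confirmed:
        return {"passed": True, "note": "expected output corrected on recheck"}

    # The code is still considered wrong, so request an explanation.
    return {
        "passed": False,
        "expected": confirmed,
        "actual": actual,
        "explanation": explain_failure(test_input, confirmed, actual),
    }
```

Passing the model calls in as plain functions keeps the checking logic separate from whichever API client is used, which also makes the flow easy to exercise with stubs.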
How we built it
I used a Flask server for the Python backend, where the test cases were generated and the user's code was run and checked. The frontend was built with React.js.
Challenges we ran into
Figuring out how to run the code in the backend was very challenging. I ultimately settled on using the subprocess module to run the code and added a timeout of 2 seconds to ensure infinite loops wouldn't crash the backend. I also had to figure out a way to extract any errors raised by the user-entered code to show on the frontend. To accomplish this, I wrapped the call to the user's function in a try/except block and printed the error, which I could then read and send to the frontend.
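A minimal sketch of that approach, assuming the user's code is written to a temporary file and called through a small harness script. The names `run_user_function`, `HARNESS`, and `user_code.py`, and the use of JSON to pass arguments and results, are assumptions for illustration rather than TestCaser's actual implementation.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical harness: imports the user's module, calls the named function
# inside try/except, and prints either the result or the error as JSON.
HARNESS = '''
import importlib, json, sys
try:
    module = importlib.import_module("user_code")
    func = getattr(module, sys.argv[1])
    args = json.loads(sys.argv[2])
    print(json.dumps({"ok": True, "result": func(*args)}))
except Exception as exc:
    print(json.dumps({"ok": False, "error": type(exc).__name__ + ": " + str(exc)}))
'''


def run_user_function(user_source, func_name, args, timeout=2.0):
    """Run func_name(*args) from user_source in a separate Python process."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "user_code.py").write_text(user_source)
        Path(tmp, "harness.py").write_text(HARNESS)
        try:
            proc = subprocess.run(
                [sys.executable, "harness.py", func_name, json.dumps(args)],
                cwd=tmp,
                capture_output=True,
                text=True,
                timeout=timeout,  # the child is killed if it runs too long
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "error": f"Timed out after {timeout} seconds"}

    # The harness prints its JSON last, after anything the user code printed.
    lines = proc.stdout.strip().splitlines()
    if not lines:
        return {"ok": False, "error": proc.stderr.strip() or "No output"}
    return json.loads(lines[-1])
```

For example, `run_user_function("def add(a, b):\n    return a + b\n", "add", [2, 3])` returns a dictionary with `"ok": True` and `"result": 5`, while a function that loops forever comes back as a timeout error instead of hanging the server. Note that a subprocess with a timeout limits run time but is not a security sandbox.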
Accomplishments that we're proud of
I'm proud of finishing most of the critical features on time. I was able to add some validation on ChatGPT's output (e.g., ensuring that the inputs created by ChatGPT met the input restrictions set by the user). I was also happy with how the generated test cases covered many edge cases and helped uncover actual bugs in the code that I wouldn't have found by testing on my own.
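As a rough illustration of the kind of validation mentioned above, the sketch below filters generated test cases by length restrictions. The function names and the `{"args": [...]}` shape of a test case are hypothetical; the real checks in TestCaser may look different.

```python
def meets_restrictions(value, min_length=None, max_length=None):
    """Return True if a sized input respects the optional length bounds."""
    try:
        length = len(value)
    except TypeError:
        # Unsized inputs (e.g. integers) have no length to check.
        return True
    if min_length is not None and length < min_length:
        return False
    if max_length is not None and length > max_length:
        return False
    return True


def filter_valid_cases(cases, min_length=None, max_length=None):
    """Keep only generated cases whose every argument meets the restrictions."""
    return [
        case
        for case in cases
        if all(
            meets_restrictions(arg, min_length, max_length)
            for arg in case["args"]
        )
    ]
```

Cases that violate the restrictions are simply dropped here; regenerating them instead would be an equally reasonable choice.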
What we learned
I learned a lot about how to run other code files in Python using things like subprocess, something I hadn't ever used before.
What's next for TestCaser
I plan on adding more validation and checks on user code, supporting more languages (right now it only supports Python), and adding a login feature to allow people to save the generated test cases.