108: GPTSense

Empty website
GPTSense detecting human written text
History of previously checked responses

Inspiration

When we were first coming up with this idea, we thought back to when plagiarism was a rampant issue in schools. Many of us have heard stories about the numerous people who got caught plagiarizing in the past, but for us as current high school students, it has not been nearly as common an occurrence. This raises the question of what changed and how we could apply it to the growing use of AI as a method of cheating. We noticed that now that there are tools like Turnitin and Google Plagiarism Check, most people don't even attempt to plagiarize out of fear of being caught. Our goal was to create a tool that empowers teachers and institutions to better detect the use of generative AI and to discourage its abuse.

What it does

This is a website that lets you input text (preferably between 5 and 30 lines) and shows the estimated percentage chance that it was AI-generated. It uses a text classification model built with TensorFlow; in other words, it uses an AI model to detect the use of AI models.

How we built it

This project can be split into two major parts. In the first, we worked on collecting as much data as possible in a short time and separating it into training, validation, and test sets. During the second half of the competition, we built a TensorFlow text classification neural network, trained on the data we had collected, that can detect the use of ChatGPT. While this was happening, one of our team members worked continuously on the front end, making sure that the website's user experience was intuitive and visually pleasing.
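The data-splitting step described above can be sketched roughly as follows. This is a minimal illustration and not our actual pipeline: the function name `split_dataset`, the 80/10/10 proportions, the fixed seed, and the placeholder samples are all assumptions made for the example.

```python
import random


def split_dataset(samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle (text, label) pairs and split them into train/val/test lists."""
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("val_fraction + test_fraction must be in [0, 1)")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Placeholder data (assumed): label 1 = AI-generated, 0 = human-written.
samples = [(f"sample text {i}", i % 2) for i in range(100)]
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))
```

Shuffling before splitting matters here because the ChatGPT responses and the human-written posts were collected separately; without it, one of the three sets could end up containing only one class.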

Challenges we ran into

Given such a tight time constraint, we weren't able to collect as much data as we had originally hoped. We were strictly limited by the OpenAI API rate limits, which meant that we could only generate so many ChatGPT responses every minute. Our dataset of human-written text was also quite narrow and consisted mostly of blog posts. This means that it is possible to trick our detector with extremely formal responses.
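One common way to stay under a per-minute request limit is to space the calls out evenly. The sketch below shows that idea under stated assumptions: `MinuteRateLimiter`, `generate_responses`, and `request_fn` are hypothetical names, `request_fn` stands in for whatever function actually calls the OpenAI API, and the real limit value is not stated here, so it is left as a parameter.

```python
import time


class MinuteRateLimiter:
    """Spaces calls evenly so at most `max_per_minute` happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


def generate_responses(prompts, request_fn, limiter):
    """Call `request_fn` (a hypothetical API wrapper) once per prompt."""
    responses = []
    for prompt in prompts:
        limiter.wait()  # pause if the previous call was too recent
        responses.append(request_fn(prompt))
    return responses
```

The clock and sleep functions are injectable so the throttle can be checked without actually waiting. In a real collection script, `generate_responses(prompts, request_fn, MinuteRateLimiter(max_per_minute))` would be called with the limit that applies to the account.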

Accomplishments that we're proud of

We are extremely proud of the accuracy of GPTSense. When given responses directly from ChatGPT, it is SHOCKINGLY accurate, correctly detecting AI-generated text well beyond the 50% of the time we would expect from random guessing. This showed that our model was not just guessing and could actually tell the difference between human and AI-generated responses. We were also able to reach 99% accuracy on our training and testing datasets, showing that GPTSense can likely be expanded on and improved in the future with more data.
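To make the comparison against guessing concrete, accuracy can be computed next to a simple baseline. This is only a sketch: the function names are assumptions, the 0.5 threshold is an assumed default, and the numbers are made up for illustration and are not GPTSense results. Note that 50% is the chance level only when the two classes are balanced, which is why the baseline below is computed from the labels.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(pred == label for pred, label in zip(predictions, labels))
    return correct / len(labels)


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most common label."""
    positives = sum(labels)
    return max(positives, len(labels) - positives) / len(labels)


# Made-up values for illustration only (1 = AI-generated, 0 = human-written).
toy_probabilities = [0.92, 0.10, 0.75, 0.40, 0.66, 0.05]
toy_labels = [1, 0, 1, 1, 0, 0]
print(accuracy(toy_probabilities, toy_labels))
print(majority_baseline(toy_labels))
```

A model is only doing better than guessing if its accuracy on held-out data clearly exceeds the baseline value.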

What we learned

Something that consistently troubled us during this hackathon was establishing a means of communication between the JavaScript front end and the Python back end. At first, we decided to exchange data by updating hidden tags in the HTML file, but we soon realized that this was a TERRIBLE method. The issue was that every time the front end and back end needed to communicate, there had to be a page refresh. We overcame this by implementing a proper system of GET and POST requests, which removed the need for constant refreshes.
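The request-based approach can be illustrated with a small JSON endpoint. This is a sketch under assumptions, not our actual backend: it uses Python's standard-library `http.server` only to stay self-contained, and the `/classify` route, the `text` and `ai_probability_percent` field names, and the `score_text` placeholder are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def score_text(text):
    """Hypothetical stand-in for the model; returns a probability in [0, 1]."""
    return 0.5  # placeholder value, not a real prediction


def handle_classify(raw_body):
    """Turn a JSON request body into (status_code, response_dict)."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"error": "body must be valid JSON"}
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return 400, {"error": "expected a non-empty 'text' field"}
    return 200, {"ai_probability_percent": round(score_text(text) * 100, 1)}


class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/classify":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_classify(self.rfile.read(length))
        body = json.dumps(response).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run_server(port=8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ClassifyHandler).serve_forever()
```

With something like this running via `run_server()`, the front end can send the text in a POST request and update the page from the JSON response, so no page refresh is needed.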

What's next for GPTSense

As mentioned above, the future of GPTSense depends largely on our ability to collect a more diverse range of data that will allow it to properly classify different writing styles. The biggest weakness of GPTSense right now is that it will often mistake formal writing for AI-generated text. One possible solution we foresee is scraping news websites and scientific articles to teach our model how to differentiate between formal human writing and AI-generated text.