We wanted to stop people from rearranging the sentences and words of an English essay to disguise copied work and getting away with it.
What it does
It uses natural language processing to detect similarities paragraph by paragraph, matches up the paragraphs, and returns a similarity score.
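The paragraph-by-paragraph comparison described above can be sketched roughly as follows. This is a minimal illustration and not the PaperScan code: it assumes plain word counts and cosine similarity in place of whatever gensim model the project used, and the function names (`split_paragraphs`, `bag_of_words`, `compare_essays`) are made up for the example. Because word order is ignored, shuffling sentences inside a paragraph or reordering the paragraphs does not lower the score.

```python
import math
import re
from collections import Counter


def split_paragraphs(text):
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def bag_of_words(paragraph):
    """Lowercased word counts; word order is deliberately ignored."""
    return Counter(re.findall(r"[a-z']+", paragraph.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compare_essays(essay_a, essay_b):
    """Match each paragraph of essay_a to its closest paragraph in essay_b.

    Returns (matches, score): matches is a list of
    (index_in_a, index_in_b, similarity) and score is the mean similarity.
    """
    bags_a = [bag_of_words(p) for p in split_paragraphs(essay_a)]
    bags_b = [bag_of_words(p) for p in split_paragraphs(essay_b)]
    if not bags_a or not bags_b:
        return [], 0.0
    matches = []
    for i, bag_a in enumerate(bags_a):
        sims = [cosine(bag_a, bag_b) for bag_b in bags_b]
        best = max(range(len(sims)), key=sims.__getitem__)
        matches.append((i, best, sims[best]))
    score = sum(sim for _, _, sim in matches) / len(matches)
    return matches, score
```

Calling `compare_essays(original, suspect)` gives both the paragraph pairing and an overall score between 0 and 1. A real checker would probably also want stop-word removal and a threshold for what counts as suspicious; both are left out here to keep the sketch short.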
How we built it
We kept external API calls to a minimum and relied on gensim, a Python library for natural language processing, for the machine learning. Gensim represents words as numeric vectors so that they can be compared, and we used those vectors to measure similarity between words and between paragraphs.
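To show what comparing vectorized words looks like, here is a small sketch that uses only the standard library. The three-dimensional vectors in `TOY_VECTORS` are invented for illustration; in the actual project the vectors would come from a model handled by gensim, and the writeup does not say which one. The helper names (`mean_vector`, `text_similarity`) are hypothetical as well.

```python
import math

# Hypothetical 3-dimensional word vectors, made up for this example.
# A trained model would supply vectors with far more dimensions.
TOY_VECTORS = {
    "essay": [0.9, 0.1, 0.0],
    "paper": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(words, vectors):
    """Average the vectors of the known words; unknown words are skipped."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dims = len(known[0])
    return [sum(vec[d] for vec in known) / len(known) for d in range(dims)]


def text_similarity(words_a, words_b, vectors):
    """Compare two word lists by the cosine of their averaged vectors."""
    vec_a = mean_vector(words_a, vectors)
    vec_b = mean_vector(words_b, vectors)
    if vec_a is None or vec_b is None:
        return 0.0
    return cosine(vec_a, vec_b)
```

With these toy values, "essay" and "paper" come out far more similar than "essay" and "banana", which is the property that lets a checker notice a swapped-in synonym. Averaging word vectors is only one simple way to get from words to paragraphs, chosen here for brevity.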
Challenges we ran into
Many, many, many. We had a very difficult time detecting similarities between words because the algorithm for doing so leans heavily on discrete math. We ran into problems left and right with the comparative analyses.
Accomplishments that we're proud of
The entire thing. We believe that this is a much-needed utility and that it will be really useful in the academic world for helping prevent cheating on exams and term papers.
What we learned
We learned a lot about natural language processing and machine learning, lol.
What's next for PaperScan
Hopefully, we will submit it to the George Mason Board in an attempt to catch cheaters. We hope that it may one day be built into Blackboard and run automatically, so that friends cannot shift sentences and words around to fool the teacher.