Inspiration

This is my final project for an Algorithms and Data Structures course, and a plagiarism checker was one of the many recommended ideas for it. It seemed like a fun idea with some challenge, which intrigued me.

What it does

The program takes in a file from the user and compares it to a set of files already held within the program. It reports what percentage of the file appears to be plagiarized (25%, 50%, 75%, 100%, or whatever percentage is found), then asks the user if they want to add the checked file to the file system for use in checking later files.
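As a rough illustration of how a plagiarism percentage like this can be computed, here is a minimal sketch. It assumes a line-by-line comparison, which may not be how the project splits files, and the function name plagiarism_percent is hypothetical.

```python
def plagiarism_percent(submitted: str, known_texts: list[str]) -> float:
    """Return the percentage of non-empty lines in `submitted`
    that also appear in at least one of `known_texts`.

    Assumption: files are compared line by line after stripping whitespace.
    """
    lines = [line.strip() for line in submitted.splitlines() if line.strip()]
    if not lines:
        return 0.0

    # Collect every non-empty line from the stored files into one set
    # so each membership check is a constant-time lookup on average.
    known_lines = {
        line.strip()
        for text in known_texts
        for line in text.splitlines()
        if line.strip()
    }

    matched = sum(1 for line in lines if line in known_lines)
    return 100.0 * matched / len(lines)


if __name__ == "__main__":
    stored = ["alpha\nbeta", "unrelated text"]
    print(plagiarism_percent("alpha\nbeta\ngamma\ndelta", stored))
```

Two of the four submitted lines appear in the stored files, so this example reports 50.0.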

How we built it

The program is built in Python 3.11. It uses the os module to move files between outside directories and the program, and MutableMapping (from collections.abc) as the base for the Map class.
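For readers unfamiliar with MutableMapping, here is a minimal sketch of a map class built on it. The class name ChunkMap and the list-backed storage are assumptions for illustration, not the project's actual Map class. A subclass only needs to provide the five methods below, and MutableMapping supplies the rest (get, update, items, and so on).

```python
from collections.abc import MutableMapping


class ChunkMap(MutableMapping):
    """Hypothetical list-backed map; lookups are a linear scan."""

    def __init__(self):
        self._table = []  # list of (key, value) pairs

    def __getitem__(self, key):
        for k, v in self._table:
            if k == key:
                return v
        raise KeyError(key)

    def __setitem__(self, key, value):
        # Overwrite an existing key, otherwise append a new pair.
        for i, (k, _) in enumerate(self._table):
            if k == key:
                self._table[i] = (key, value)
                return
        self._table.append((key, value))

    def __delitem__(self, key):
        for i, (k, _) in enumerate(self._table):
            if k == key:
                del self._table[i]
                return
        raise KeyError(key)

    def __iter__(self):
        for k, _ in self._table:
            yield k

    def __len__(self):
        return len(self._table)


if __name__ == "__main__":
    chunks = ChunkMap()
    chunks["essay.txt"] = ["first chunk", "second chunk"]
    print(len(chunks), list(chunks))
```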

Challenges we ran into

Working alone and being unfamiliar with data structures and classes led to many slowdowns whenever I hit a problem I wasn't sure how to fix. Two major examples: the Mapper class did not initialize its table variable correctly because I had not created a slot for it in __slots__, and I was unable to properly check when a file was smaller than the checked file because I had only implemented __lt__, not __gt__, __le__, or __ge__.
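Both problems can be reproduced in a small sketch. The class FileEntry and its fields are hypothetical, and functools.total_ordering is one standard-library way to get the remaining comparison methods from __eq__ and __lt__; it is not necessarily what the project ended up using.

```python
from functools import total_ordering


@total_ordering  # derives __gt__, __le__, __ge__ from __eq__ and __lt__
class FileEntry:
    # Every instance attribute must be listed here; assigning to a name
    # that is missing from __slots__ raises AttributeError.
    __slots__ = ("name", "size")

    def __init__(self, name, size):
        self.name = name
        self.size = size

    def __eq__(self, other):
        if not isinstance(other, FileEntry):
            return NotImplemented
        return self.size == other.size

    def __lt__(self, other):
        if not isinstance(other, FileEntry):
            return NotImplemented
        return self.size < other.size


if __name__ == "__main__":
    small = FileEntry("a.txt", 10)
    big = FileEntry("b.txt", 20)
    print(small <= big, big >= small)

    try:
        small.table = []  # "table" is not in __slots__
    except AttributeError as err:
        print("AttributeError:", err)
```

Adding "table" to the __slots__ tuple is the fix for the first problem; the decorator (or writing the missing methods by hand) is the fix for the second.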

Accomplishments that we're proud of

Getting the program working with some time to spare. The issues I ran into required me to expand my knowledge of classes and of getting them to work with each other. Making my own class for splitting files up and comparing them is inefficient, but it was very fun to get done.

What we learned

I learned a lot about creating classes, including adding comparison functionality, and a bit about libraries such as os and difflib. difflib was only used in early versions, since its SequenceMatcher class has some functions that many online tutorials recommend for basic file comparison. difflib made things too easy though, so I also learned to split files and append the pieces to my Mapper class.
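For reference, the kind of basic comparison those tutorials show looks roughly like the sketch below. The helper name similarity_percent is hypothetical; SequenceMatcher and its ratio() method are part of the standard difflib module.

```python
from difflib import SequenceMatcher


def similarity_percent(text_a: str, text_b: str) -> float:
    """Return how similar two strings are, as a percentage.

    ratio() is 2 * M / T, where M is the number of matching characters
    and T is the combined length of both strings.
    """
    return 100.0 * SequenceMatcher(None, text_a, text_b).ratio()


if __name__ == "__main__":
    print(similarity_percent("the quick brown fox", "the quick red fox"))
```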

What's next for Plagiarism Checker

In the future, I would like to implement a GUI for this program and improve its time complexity, as the current algorithm for checking files is very unrefined and slow. This shows with larger files, and if the number of files held in the Files folder grew very large, the program would slow down massively. A way to store already parsed files would help tremendously, as would a faster way to cross-check the maps.
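One possible way to store already parsed files is an in-memory cache keyed by path and modification time. This is only a sketch of the idea: the class ParsedFileCache is hypothetical, and it assumes that parsing means collecting the non-empty lines of a UTF-8 text file.

```python
import os


class ParsedFileCache:
    """Hypothetical cache that re-parses a file only when it has changed."""

    def __init__(self):
        self._cache = {}      # path -> (mtime, frozenset of lines)
        self.parse_count = 0  # how many times a file was actually read

    def lines_for(self, path):
        mtime = os.path.getmtime(path)
        cached = self._cache.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # file unchanged since last parse

        with open(path, encoding="utf-8") as handle:
            lines = frozenset(line.strip() for line in handle if line.strip())

        self.parse_count += 1
        self._cache[path] = (mtime, lines)
        return lines
```

With something like this, checking a new submission would read each stored file at most once per run, and comparing two files' line sets becomes a set intersection instead of a nested loop.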
