People are lazy: they don't want to read gigantic documents, especially legal ones. When you want to use some "hip, rad, and dope" software, you almost always have to accept a license agreement. Almost everyone accepts without reading the actual legal-speak. We wanted to make it easy to understand what you're signing up for and how it affects you. After all, no one wants to unknowingly sign away their firstborn or a kidney.
What it does
Takes an EULA text file as input, summarizes each section under its heading, and shows the result to the user in a window.
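As a rough illustration of the first step, here is a minimal sketch of splitting an EULA into sections by heading. The heading pattern (a numbered line such as "1. License Grant", or a line in all caps), the `split_sections` name, and the "Preamble" label are assumptions made for this example, not necessarily what EULDR does.

```python
import re

# Assumed heading pattern: "1. License Grant" or an ALL-CAPS line.
# Real EULAs vary a lot, so treat this as a guess rather than a rule.
HEADING = re.compile(r"^(?:\d+(?:\.\d+)*[.)]\s+\S.*|[A-Z][A-Z0-9 ,&/'-]{3,})$")


def split_sections(text):
    """Return a list of (heading, body) pairs in document order."""
    sections = []
    heading, body = "Preamble", []  # text before the first heading
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if len(line) <= 80 and HEADING.match(line):
            # Close the previous section, skipping an empty preamble.
            if body or heading != "Preamble":
                sections.append((heading, " ".join(body)))
            heading, body = line, []
        else:
            body.append(line)
    if body or heading != "Preamble":
        sections.append((heading, " ".join(body)))
    return sections
```

A body line that happens to start with a number and a period would be mistaken for a heading here, which is the kind of formatting difference that makes this step harder than it looks.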
How I built it
We experimented heavily with rudimentary statements and with multiple frameworks, such as NLTK. Those turned out to be a wash. I then found a research paper that gave me a wordy outline to start from. The code I wrote breaks the content down to its most basic form - plain word lists. I then take each sentence in that array, compute its "score", and compare it against the other scored sentences from the same section. The sentences whose words intersect the most with the others are the ones chosen for the summary.
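The scoring idea can be sketched roughly as follows. This is one reading of the description above, not the project's actual code: the function names, the naive regex sentence splitter, and the normalisation by average sentence length are all assumptions.

```python
import re


def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def words(sentence):
    """Lower-cased set of the words in a sentence."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def overlap(a, b):
    """Shared words, normalised by the average size of the two word sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / ((len(a) + len(b)) / 2)


def summarize_section(text):
    """Return the sentence that overlaps most with the rest of its section."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    bags = [words(s) for s in sentences]
    # A sentence's score is its total overlap with every other sentence.
    scores = [
        sum(overlap(bag, other) for j, other in enumerate(bags) if j != i)
        for i, bag in enumerate(bags)
    ]
    best = max(range(len(sentences)), key=scores.__getitem__)
    return sentences[best]
```

The intuition is that a sentence sharing many words with its neighbours probably carries the main point of the section, so it stands in for the whole section in the summary.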
Dennis created an easy-to-read GUI using Tkinter. It removes the difficulty of reading a black-and-white terminal and presents the data in a more spread-out fashion.
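For a sense of how little Tkinter code a window like this needs, here is a minimal sketch. It is not Dennis's actual GUI: the function names, the window title, and the layout are assumptions, and it uses the standard library's `ScrolledText` widget for a scrollable, read-only view.

```python
def format_summaries(summaries):
    """Turn (heading, summary) pairs into one block of display text."""
    return "\n\n".join(f"{heading}\n  {summary}" for heading, summary in summaries)


def show_summaries(summaries, title="EULA summary"):
    """Open a window with a scrollable, read-only view of the summaries."""
    # Imported here so the formatting helper also works without a display.
    import tkinter as tk
    from tkinter import scrolledtext

    root = tk.Tk()
    root.title(title)
    box = scrolledtext.ScrolledText(root, wrap=tk.WORD, width=80, height=24)
    box.insert(tk.END, format_summaries(summaries))
    box.configure(state="disabled")  # make the text read-only
    box.pack(fill=tk.BOTH, expand=True, padx=8, pady=8)
    root.mainloop()
```

Calling `show_summaries([("1. License Grant", "You may install one copy.")])` on a machine with a display opens the window and blocks until it is closed.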
Challenges I ran into
Configuring the different libraries needed, splitting the text into segments to be parsed, and actually summarizing the whole thing.
Accomplishments that I'm proud of
We were able to accurately summarize a large text document and provide meaningful information about each item.
What I learned
Natural language processing in Python.
What's next for EULDR
Discover algorithms that can be used on a wider range of EULAs.
We didn't have time to handle multiple different styles. While there is a set standard for what must be included in an EULA, it's up to the company, licensor, etc. to include clauses and/or apply their own formatting style.