I'm not hating on all books here, but some are definitely out of my league. I find myself googling "define: insert word here" every other sentence, sometimes for the same word five or six times (which is pretty disheartening). So what if I had a tool that could put those definitions next to the word, or better yet, swap the word for something simpler? And thus, from my sin of sloth, LazyReader was born.
What it does
LazyReader is a program that determines the difficulty level of the words in a sentence and replaces the hard ones with easier-to-understand synonyms. You type in a sentence, pick a difficulty level based on a given word database, and whabam, your sentence now uses words at that difficulty level (it may oversimplify a tad).
How we built it
Using a dictionary API and a part-of-speech identifier, LazyReader tags the tokens within a sentence that can be simplified down the line. We rank the difficulty of words using a frequency list from NGSL (31k words ordered from most to least frequent), assume that the less frequent a word is, the more difficult it is, and apply exponential binning with some base x to categorize the words. We can then give the user a test that estimates their level on this list, and spit out the altered sentence at approximately that level of difficulty, based on how well they performed on the test.
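To make the binning idea concrete, here is a minimal sketch of how a frequency rank could be mapped to a difficulty level. The class name, the first bin size of 1000, and the growth factor of 2 are all assumptions picked for illustration, not the values LazyReader actually uses. With these numbers, a 31k-word list happens to split into exactly five levels.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: names and constants are illustrative assumptions.
final class DifficultyBins {
    // Assumed parameters: the first bin holds FIRST_BIN_SIZE words and each
    // later bin is GROWTH times larger than the one before it.
    static final int FIRST_BIN_SIZE = 1000;
    static final int GROWTH = 2;

    /** Maps a 0-based frequency rank (0 = most frequent) to a level starting at 1. */
    static int levelForRank(int rank) {
        if (rank < 0) {
            throw new IllegalArgumentException("rank must be non-negative");
        }
        int level = 1;
        long binSize = FIRST_BIN_SIZE;
        long upperBound = FIRST_BIN_SIZE; // exclusive upper rank of the current bin
        while (rank >= upperBound) {
            binSize *= GROWTH;      // the next bin is exponentially larger
            upperBound += binSize;
            level++;
        }
        return level;
    }

    /** Looks a word up in a most-to-least-frequent list; unknown words return -1. */
    static int levelForWord(List<String> wordsByFrequency, String word) {
        int rank = wordsByFrequency.indexOf(word.toLowerCase(Locale.ROOT));
        return rank < 0 ? -1 : levelForRank(rank);
    }
}
```

Rarer words land in higher levels, so any noun whose level is above the user's tested level becomes a candidate for replacement.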
Challenges we ran into
Overcoming English grammar was difficult. We originally went into the project thinking we were going to do everything from verbs to adjectives to adverbs. Turns out the tools out there are few, and even then we had to worry about other things such as verb tense and pluralization. We settled for nouns only (non-proper) and figured out a way to handle pluralization using a more specific part-of-speech identifier, which lets us separate out the more complex plural words (e.g. geese, mice, etc... darn that English language) without listing all the rules for each. It was also really annoying to have to adjust the file path every single time we pulled, until we discovered File.separator and System.getProperty("user.dir"), which saved us a lot of time and made the project usable on more OSs with different file-separating characters.
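For anyone hitting the same file path problem, a small sketch of the idea follows. The class name and the "data" and "ngsl.txt" segments are made up for illustration and are not LazyReader's real layout. The second method shows the java.nio alternative, which inserts the separators for you.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: the class and the file names passed to it are assumptions.
final class DataPaths {
    /** Joins segments under the working directory using the platform's separator. */
    static String underWorkingDir(String... segments) {
        // "user.dir" is the directory the JVM was started from.
        StringBuilder path = new StringBuilder(System.getProperty("user.dir"));
        for (String segment : segments) {
            path.append(File.separator).append(segment);
        }
        return path.toString();
    }

    /** Builds the same kind of path with java.nio instead of manual joining. */
    static Path underWorkingDirNio(String... segments) {
        return Paths.get(System.getProperty("user.dir"), segments);
    }
}
```

Calling DataPaths.underWorkingDir("data", "ngsl.txt") gives a path that works from a fresh clone on any OS, as long as the program is launched from the repo root.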
Accomplishments that we're proud of
We're just happy that it works consistently on multiple clones. Getting everything set up was already an undertaking, and with people working on multiple OSs and localizations, we ran into difficulties we never expected from simply unzipping and cloning the repo on multiple computers.
What we learned
Teamwork makes the dream work. Everyone played a part in the team, and since everyone came from varying sequences, we exchanged knowledge along the way (learning git, learning dependency handling, etc.). Programming with everyone pushing at basically the same time (on the same thing, creating conflicts) introduced a new set of errors that we did not expect, but it was a welcome addition to what we learned.
What's next for LazyReader
Bigger databases for more accurate difficulty processing, and better part-of-speech models such as the Stanford one. Verb handling and the stars above. The future possibilities of LazyReader are limitless, until someone decides to write a program that paraphrases paragraphs; then I'll never have to read another complex book ever again.