Initially, we wanted to create an app with Microsoft's Linguistics Analysis API to judge the difficulty of an English article based on the common vocabulary of a non-native speaker. However, because of that API's limitations for our purposes, we instead based our application on the Japanese Language Proficiency Test, judging the difficulty of Japanese articles by their character complexity.
What it does
EasyReadJP parses the text of an article at a user-provided URL and returns the level of knowledge required to read through the source, by JLPT (Japanese Language Proficiency Test) standards.
How we built it
To get information about each Japanese character's complexity level, we parsed jisho.com's dictionary results for each individual word. We used Ruby on Rails as our framework, with the open-uri and Nokogiri libraries for web scraping. After retrieving the text from a URL, we scraped the dictionary entry for each word, then computed the overall difficulty of the text as a percentage, along with its associated JLPT level.
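The scoring step could look roughly like the sketch below. It is not our actual code: the SAMPLE_LEVELS table stands in for the scraped dictionary results, and the method names and the 80% coverage threshold are assumptions made purely for illustration.

```ruby
# Hypothetical lookup table standing in for scraped dictionary results.
# Keys are words; values are JLPT levels (5 = N5, easiest; 1 = N1, hardest).
SAMPLE_LEVELS = {
  "猫" => 5,
  "食べる" => 5,
  "経済" => 3,
  "憂鬱" => 1
}.freeze

# Counts how many words fall at each JLPT level.
# Words without a known level are skipped.
def level_counts(words, levels)
  words.filter_map { |word| levels[word] }.tally
end

# Returns the share of known words at each level, as percentages.
def level_breakdown(words, levels)
  counts = level_counts(words, levels)
  total = counts.values.sum
  counts.transform_values { |n| (100.0 * n / total).round(1) }
end

# Walks from N5 towards N1 and returns the first level at which the
# cumulative share of known words reaches the threshold (assumed: 80%).
# Returns nil when no word has a known level.
def required_level(words, levels, threshold: 80.0)
  counts = level_counts(words, levels)
  total = counts.values.sum
  return nil if total.zero?

  covered = 0
  5.downto(1) do |lvl|
    covered += counts.fetch(lvl, 0)
    return "N#{lvl}" if 100.0 * covered / total >= threshold
  end
  "N1"
end

words = ["猫", "食べる", "経済", "憂鬱", "未知"]
puts level_breakdown(words, SAMPLE_LEVELS).inspect
puts required_level(words, SAMPLE_LEVELS)
```

Lowering the threshold makes the result more forgiving: a reader who only needs to recognise half of the words would get an easier level for the same text.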
Challenges we ran into
This was our first web application built with the Rails framework, and moving beyond front-end-only web "applications" required us to get used to standard development conventions such as the Model-View-Controller pattern. Routing form data to Ruby scripts and handling Japanese Unicode text were two of the problems we faced during development. Luckily, a few of the participants and booth members were able to help us through the creation of our application!
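For the Unicode side, Ruby's regular expressions can match characters by script, which is one way to separate kanji from kana. The helpers below are a minimal sketch under that assumption; their names are hypothetical and they are not taken from our code.

```ruby
# Classifies a single character by its Unicode script property.
# Punctuation, Latin letters, and marks shared between scripts
# fall under :other.
def script_of(char)
  case char
  when /\p{Han}/ then :kanji
  when /\p{Hiragana}/ then :hiragana
  when /\p{Katakana}/ then :katakana
  else :other
  end
end

# Returns the distinct kanji in a text, in order of first appearance.
def kanji_in(text)
  text.scan(/\p{Han}/).uniq
end

# Counts the characters of a text by script.
def script_counts(text)
  text.each_char.map { |char| script_of(char) }.tally
end

sample = "猫がミルクを飲む"
puts kanji_in(sample).inspect
puts script_counts(sample).inspect
```

Working with script properties instead of hard-coded code point ranges keeps the checks readable, though it assumes the input string is already valid UTF-8.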
Accomplishments that we're proud of
Building our first fully working web app is a big achievement for us, even though it is relatively small in scale, because we had ended previous hackathons with largely unfinished projects.
What we learned
We explored the capabilities of the Ruby language and the Rails framework at every turn of the project, especially whenever we ran into problems, which should help us reduce development time on future hackathon projects and even personal ones. We also learned how to take advantage of Ruby gems for character processing and web scraping.
What's next for EasyReadJP
If we continue this project to help students and other users learn Japanese by identifying suitable reading material, we would like to offer a curated list of links that have already been parsed, instead of having the user provide a URL for us to parse. Showing those results immediately to visitors would make searching for reading material more effective and the experience smoother.