Inspiration

We were talking about little side projects we had each done. Specifically, Ryan had some experience scraping data from the Internet, and Patrick had toyed around with a vocabulary learning device for language study. Patrick also frequently helped non-native speakers develop English intuition. Ryan mentioned how it would be cool to have a "sandbox-like" application to help people learn certain topics in school, and Patrick replied that it would be cool for language learners to be able to input "experiment" phrases into the sandbox and get some feedback on whether fluent speakers might actually use them. We both thought this was a fantastic idea and we went from there.

What it does

Given a phrase from the user, it returns examples of that phrase in use from Reddit, Twitter, and select news and book sources (corpora). The program also indicates how likely the examples are to be seen in natural language (reliability). It marks a set of examples as reliable if there are at least N qualifying examples that were returned and the average number of retweets or upvotes was above a certain number.
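The reliability rule described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Example` class, the `is_reliable` function, and the threshold values `min_examples` and `min_avg_score` are all hypothetical stand-ins for "N" and "a certain number".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    text: str
    score: int  # upvotes or retweets (hypothetical field name)


def is_reliable(examples: List[Example],
                min_examples: int = 5,         # assumed value for "N"
                min_avg_score: float = 10.0    # assumed score threshold
                ) -> bool:
    """Return True if there are enough examples and their average score is high enough."""
    if len(examples) < min_examples:
        return False
    average = sum(e.score for e in examples) / len(examples)
    return average > min_avg_score
```

With these assumed thresholds, five examples that each have 20 upvotes would be marked reliable, while four such examples would not, because the count falls below `min_examples`.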

How I built it

Examples are generated using both the Reddit and Twitter APIs. Results are then filtered to keep only posts that contain the phrase (case-insensitive match) and sorted by descending upvotes or retweets. Only the examples with the most upvotes/retweets are selected for display (taking advantage of Internet users' tendency to ignore poorly written posts). The program generates examples from news and book sources using Python's NLTK (Natural Language Toolkit) library.
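The filter-and-sort step might look something like the sketch below. It assumes the posts have already been fetched and reduced to `(text, score)` pairs; the function name `top_examples` and the `limit` parameter are hypothetical, and no real Reddit or Twitter API calls are shown.

```python
from typing import List, Tuple

Post = Tuple[str, int]  # (text, upvotes or retweets); an assumed shape


def top_examples(posts: List[Post], phrase: str, limit: int = 3) -> List[Post]:
    """Keep posts containing the phrase (case-insensitive), highest score first."""
    needle = phrase.casefold()
    # Substring match that ignores case
    matches = [post for post in posts if needle in post[0].casefold()]
    # Sort by score, descending, then keep only the top few
    matches.sort(key=lambda post: post[1], reverse=True)
    return matches[:limit]


posts = [
    ("I could care less about that", 12),
    ("Honestly, I COULD CARE LESS", 40),
    ("Something unrelated", 99),
]
best = top_examples(posts, "could care less", limit=2)
```

Here `best` contains the two matching posts with the 40-point post first; the unrelated post is dropped despite its higher score.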

Challenges I ran into

-It was difficult to determine the best way to filter potential examples. We tried examples of "good" and "poor" English and found that retweets/upvotes appeared to be correlated with the quality of writing.
-The GUI widgets that displayed the examples did not clear in an intuitive way, and we had to do a significant amount of research in the documentation to find a solution.

Accomplishments that I'm proud of

-An application that can help developing English speakers gain intuition for the language and let existing speakers satisfy their curiosity

What I learned

What's next for The Robin Scraper

Built With
