After the Democratic nomination debate a few days ago, we began to wonder exactly how the millennials of our generation felt about the candidates. How accurate is the stereotype that all college students love Bernie Sanders? In an effort to measure where our generation stands politically, we dove into the application data firsthand to see for ourselves.
We assumed that Yik Yak users are a statistically representative sample of each campus's student population.
What it does
This program takes Yaks (posts) from the Yik Yak platform at a hundred popular campuses across the United States and produces both a geographical and a lexicographical model of each individual campus and of the climate of Yik Yak as a whole. We treat this as a proxy for the opinions of college students. While we were too late to retrieve the Yaks from the Democratic debate, because Yaks are recycled on a daily basis, we found that analyzing Yaks posted after the debate still produced statistically significant trends, not just for individual campuses but for Yik Yak users as a whole.
How we built it
We scraped each individual Yak after reverse-engineering the Yik Yak Web API, analyzed it semantically using Alchemy's Watson Contextual Analysis (the Alchemy API, for natural language processing) to produce the lexicographical model, plotted the campuses using AmCharts, and associated them with the generated keywords. We stored the sentiment data in a MySQL database running on a MAMP stack.
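The step that associates each campus with its keywords can be sketched roughly as below. This is not our actual pipeline: plain word counting stands in for the Alchemy keyword extraction, and the function name `top_keywords_by_campus`, the `STOP_WORDS` list, and the `(campus, text)` input shape are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list; a real extractor is far more thorough.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "i", "my", "so", "for", "on", "it"}


def top_keywords_by_campus(yaks, n=3):
    """Return the n most frequent non-stop-words for each campus.

    `yaks` is an iterable of (campus, text) pairs. Word frequency is a
    stand-in for keyword extraction, not what the Alchemy API does.
    """
    counts = defaultdict(Counter)
    for campus, text in yaks:
        # Lowercase the text and split it into simple alphabetic tokens.
        words = re.findall(r"[a-z']+", text.lower())
        counts[campus].update(w for w in words if w not in STOP_WORDS)
    # Keep only the words, dropping the counts, for display next to each campus.
    return {campus: [w for w, _ in c.most_common(n)] for campus, c in counts.items()}
```

Each campus's list could then be attached to its marker on the map, so hovering over a campus shows what its Yaks are about.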
Challenges we ran into
Accomplishments that we're proud of
Actually producing a comprehensible data model that somebody can analyze without a background in data science or analytics. Anyone can look at this model and understand exactly what the climate of the campuses is.
What we learned
We learned that college kids really love girls, BONJOURNO ROME, cute asian guys, ahhh warm weather, and girls (did I say that already?).
What's next for YAKMAP
We want to make the design much more modular and continuously updating, and to scrape the Yik Yak data in a way that does not nuke their server. Our current implementation puts a negligible load on their server, and we believe maintaining that integrity is essential to the ethical continuation of our project.
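One simple way to keep the load low while scraping continuously is to enforce a minimum delay between requests. The sketch below is only an illustration of that idea, not our current scraper: the `Throttle` class and its parameters are hypothetical, and the right interval would have to be chosen with the server in mind.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        # min_interval is in seconds; clock and sleep are injectable for testing.
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before every request, with something like `Throttle(2.0)`, caps the scraper at one request every two seconds no matter how many campuses are in the queue. The two-second figure is an arbitrary example.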