I wanted to make an application that tracks the popularity of certain memes on Twitter using text analysis.
What it does
It queries for the most recent tweets and compares them to the tweets the model was trained on previously.
How I built it
I used the twitterscraper Python library to scrape tweets for certain memes and streamed them into Google BigQuery, where I ran some SQL queries to make the data more usable. I then used Google Cloud Dataflow to clean up the text of the tweets by removing emoji and shortening URLs to just their hostname. Finally, I used TensorFlow to classify the text according to certain memes.
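The text cleanup described above (removing emoji and shortening URLs to their hostname) can be sketched in plain Python roughly as follows. This is not the actual Dataflow code: the function names `clean_tweet` and `url_to_hostname` are hypothetical, and the emoji ranges are an approximation I am assuming for illustration, not a complete list.

```python
import re
from urllib.parse import urlparse

# Approximate emoji ranges (an assumption, not exhaustive).
EMOJI_RE = re.compile(
    "["
    "\U0001F1E6-\U0001F1FF"  # regional indicator (flag) symbols
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, newer symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F"                 # variation selector that often follows an emoji
    "]+"
)

# Anything that starts with http:// or https:// up to the next whitespace.
URL_RE = re.compile(r"https?://\S+")


def url_to_hostname(match):
    """Replace a matched URL with just its hostname (hypothetical helper)."""
    try:
        host = urlparse(match.group(0)).hostname
    except ValueError:
        # Malformed URL: drop it rather than fail the whole tweet.
        host = None
    return host or ""


def clean_tweet(text):
    """Shorten URLs to hostnames, strip emoji, and collapse whitespace."""
    text = URL_RE.sub(url_to_hostname, text)
    text = EMOJI_RE.sub("", text)
    return " ".join(text.split())


if __name__ == "__main__":
    print(clean_tweet("so true 😂 https://example.com/a/b?x=1"))
```

A function like this could in principle be applied to each tweet inside a pipeline step, with the cleaned text then passed on to the classifier.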
Challenges I ran into
I had a lot of trouble working with Dataflow, as I had never used it before. Additionally, the training sets are rather poor because I didn't have time to go through all the tweets and check that they were properly classified.
Accomplishments that I'm proud of
I did some cool machine learning work on platforms I had never used before.
What I learned
I learned a lot about Google Cloud Platform, TensorFlow, and text classification.
What's next for Meme Learning
I really want to keep polishing it and to properly categorize the training data so the models perform better.