Inspiration

We were introduced to Markov Chains this semester in one of our Computer Science classes. After experimenting with them (and seeing the interesting and humorous output they can produce), we decided to make them the core of our project.

What it does

Our project builds a Markov Chain from the Tweets of Marco Rubio and Donald Trump, then uses it to generate and post new Tweets. The results are certainly interesting!

How we built it

We used the TwitterAPI to retrieve, analyze and post Tweets. We used Python to create modules that can create Markov Chains from text files, and used a Python script to retrieve Tweets, analyze them, create a Markov Chain, and generate/post a Tweet. We used GitHub for version control.
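To make the idea concrete, here is a minimal sketch of a word-level Markov Chain built from plain text. It is not our actual module: the function names `build_chain` and `generate`, and the `max_words` and `rng` parameters, are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        # Repeated successors are kept, so frequent pairs are picked more often.
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, max_words=20, rng=random):
    """Walk the chain from `start`, choosing a random successor at each step."""
    word = start
    output = [word]
    # Stop at the word limit or when the current word has no known successor.
    while len(output) < max_words and word in chain:
        word = rng.choice(chain[word])
        output.append(word)
    return " ".join(output)
```

In a setup like ours, the text passed to `build_chain` would be the contents of the .txt file of collected Tweets, and `start` would be any word that appears in it.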

Challenges we ran into

Our first challenges came from limits imposed by the TwitterAPI itself. For example, each request can return a maximum of 200 Tweets. We worked around this by requesting Tweets in batches and filtering each batch to exclude Tweets we had already received. Another challenge was taking Tweet data from multiple JSON files and writing it to a single .txt file (this was necessary because our Markov Chains were generated from .txt files). We solved this through experimentation and reading documentation.

We had planned to run markov_main.py (our Tweet-generating/Tweet-posting script) on a regular interval/timer. This proved more difficult than we expected and remains unsolved; with more time, we probably could have cracked it.

Finally, we worked to tailor our Markov Chain algorithm to create Tweet-like output. This meant limiting the length of the output, stopping Tweet generation when sentence-ending punctuation (like exclamation points and periods) was encountered, and starting each Tweet with a capitalized word. This reduced instances of cryptic/unintelligible Tweets. We also had to filter out any mentions so that our Twitter Bot would not be mentioning other Twitter users.
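The Tweet-specific rules above can be sketched roughly as follows. This is an illustration under assumptions, not the code in markov_main.py: the names `clean_words` and `generate_tweet` are hypothetical, the 140-character default for `max_chars` is our guess at a sensible limit, and `chain` is assumed to be a dictionary mapping each word to a list of the words that follow it.

```python
import random

# Punctuation that ends a generated Tweet (assumed set).
SENTENCE_END = (".", "!", "?")


def clean_words(text):
    """Split text into words, dropping any word that contains an @mention."""
    return [w for w in text.split() if "@" not in w]


def generate_tweet(chain, max_chars=140, rng=random):
    """Generate Tweet-like text from a word -> successors mapping."""
    # Only start from capitalized words so the output reads like a sentence.
    starts = [w for w in chain if w[:1].isupper()]
    if not starts:
        return ""
    word = rng.choice(starts)
    output = [word]
    # Stop on sentence-ending punctuation or when no successor is known.
    while not word.endswith(SENTENCE_END) and word in chain:
        nxt = rng.choice(chain[word])
        # Stop before the text would exceed the length limit.
        if len(" ".join(output + [nxt])) > max_chars:
            break
        output.append(nxt)
        word = nxt
    return " ".join(output)
```

Filtering mentions with `clean_words` before the chain is built means a mention can never appear in the output, which is simpler than checking every generated Tweet afterwards.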

Accomplishments that we're proud of

We're proud to see that our Twitter Bot is up and running as intended! Some Tweets have been quite humorous and strange, while others are realistic reflections of Trump's and Rubio's Twitter activity and style of speaking. It was very interesting to see the similarities and differences between our Twitter Bot's Tweets when we fed it Tweets from different politicians. For example, the Bot generated Tweets with a very different tone and style when fed Tweets from Trump than when fed Tweets from Bernie Sanders.

What we learned

We learned a number of things, but here are some of the most prominent. We figured out rather quickly how powerful APIs can be, but we also realized that APIs sometimes don't fully align with your development goals. In the case of the TwitterAPI, we were able to retrieve Tweets without having to crawl Twitter ourselves, but had problems with the limit of 200 Tweets per request imposed by the API. We also learned how to optimize an algorithm for a specific task. In our case, we had to adapt a relatively generic Markov Chain algorithm to create coherent Tweets. Finally, we learned the importance of being stringent with version control. Due to miscommunication and small mistakes, we had some issues with keeping the whole team on the same page.
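As an illustration of working within the per-request limit, the sketch below collects Tweets in batches and skips any Tweet already received. It makes no real API calls: `fetch_page` is a hypothetical callable standing in for whatever performs the request, and we assume it accepts `max_id` and `count` arguments and returns Tweets, newest first, as dictionaries with an `id` key.

```python
def fetch_all(fetch_page, batch_size=200, max_batches=20):
    """Collect Tweets batch by batch, skipping any that were already seen."""
    seen_ids = set()
    tweets = []
    max_id = None  # None means "start from the newest Tweet"
    for _ in range(max_batches):
        batch = fetch_page(max_id=max_id, count=batch_size)
        # Filter out Tweets returned by an earlier request.
        new = [t for t in batch if t["id"] not in seen_ids]
        if not new:
            break  # nothing new came back, so the timeline is exhausted
        for t in new:
            seen_ids.add(t["id"])
            tweets.append(t)
        # Ask for Tweets at or below the oldest ID seen so far; the overlap
        # of one Tweet is removed by the filter on the next pass.
        max_id = min(t["id"] for t in new)
    return tweets
```

Passing the request function in as an argument keeps the batching logic independent of the API client, so it can be tried out with a fake function that serves Tweets from a list.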

What's next for Twitter-Markov-Chain

We hope to use the project to further analyze trends among political parties and political groups. We had originally intended to compare the output of the Twitter Bot when fed Tweets from Republican politicians with its output when fed Tweets from Democratic politicians. The hope was to draw parallels on political issues and also to draw conclusions about how different parties "think." We also hoped to draw conclusions about the tone of each party's statements. For example, we wanted to see whether one party was typically more optimistic or pessimistic than the other. A long-term goal would be to make it so that our Bot could reply to Twitter users!
