On the surface, it seems that markets react whenever a headline appears about a new product or service. Our original goal was to use Google's AutoML Natural Language classifier to build a model that could predict whether a given index would go up or down based on that day's headlines. However, API rate limits prevented us from collecting enough headlines to build a reasonable training dataset. Because of this, we pivoted to the next most interesting thing that could be toying with the market: Trump's tweets.
What it does
Our site allows users to "take control" of Donald J. Trump's Twitter account and use it to wreak havoc on the S&P 500. Type in your proposed tweet and observe how our ML model anticipates the index will be affected.
How we built it
First, we built a tool to generate large quantities of training data for our model. This involved creating a classification system for the change in an index on a given day, and a Twitter scraper that downloads the President's tweets. Next, we paired each tweet with the classification for the day it was published, rebalanced our training data, and uploaded it to Google AutoML for training. With the model trained, our React.js frontend and Node.js backend let the user query it with a proposed tweet and visualize the predicted change.
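The labeling, pairing, and rebalancing steps can be sketched roughly as follows. This is a minimal illustration and not our actual tooling: the type names, the three-way `up`/`down`/`flat` labels, the `flatBand` threshold, and rebalancing by downsampling are all assumptions made for the example.

```typescript
// Hypothetical types; field names are illustrative, not from the project.
type Label = "up" | "down" | "flat";

interface DailyQuote { date: string; open: number; close: number; } // date as YYYY-MM-DD
interface Tweet { date: string; text: string; }
interface Example { text: string; label: Label; }

// Classify a trading day by its percentage change from open to close.
// flatBand (in percent) is an assumed cutoff for "no meaningful move".
function classifyDay(day: DailyQuote, flatBand = 0.1): Label {
  const pctChange = ((day.close - day.open) / day.open) * 100;
  if (pctChange > flatBand) return "up";
  if (pctChange < -flatBand) return "down";
  return "flat";
}

// Pair each tweet with the label of the day it was published.
// Tweets from days without market data (weekends, holidays) are dropped.
function pairTweets(tweets: Tweet[], days: DailyQuote[]): Example[] {
  const labelByDate: Record<string, Label> = {};
  for (const d of days) labelByDate[d.date] = classifyDay(d);

  const examples: Example[] = [];
  for (const t of tweets) {
    const label = labelByDate[t.date];
    if (label !== undefined) examples.push({ text: t.text, label });
  }
  return examples;
}

// Rebalance by downsampling every non-empty class to the smallest class size.
function rebalance(examples: Example[]): Example[] {
  const buckets: Record<Label, Example[]> = { up: [], down: [], flat: [] };
  for (const e of examples) buckets[e.label].push(e);

  const nonEmpty = [buckets.up, buckets.down, buckets.flat].filter(b => b.length > 0);
  const minSize = Math.min(...nonEmpty.map(b => b.length));

  const balanced: Example[] = [];
  for (const b of nonEmpty) balanced.push(...b.slice(0, minSize));
  return balanced;
}
```

Downsampling discards data from the larger classes. Oversampling the smaller classes is an alternative when the dataset is already small.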
Challenges we ran into
Accomplishments that we're proud of
What we learned
Quality is just as important as quantity when collecting training data. We knew that collecting more data would generally improve our predictions, but we also learned how much the quality of that data matters, along with some strategies for cleaning out useless noise.
What's next for tweetliketrump.online
The site currently shows only how a single tweet, taken in isolation, is predicted to move the S&P 500. We believe the user experience would be more fun and impactful if the change from each tweet persisted, so that users could simulate tweeting throughout the day. Our ML models can definitely be improved. Higher API limits and permissions would let us collect a more meaningful number of Trump tweets, and might also let us return to our original goal of using news headlines to make predictions.