Title: The Effect of Elon Musk’s Tweets on the Crypto Market
Final Report: https://docs.google.com/document/d/1Pla3F9b_DRsV_OusKTwGiB8_u0n3fqLbSlWCxjntdYA/edit#
Second Check-in Reflection: https://docs.google.com/document/d/1d3vl_8wykUyIdo-p9Ujvgfeg2-5IK4F1IGcij79XJpU/edit
Better Resolution of Our Poster: https://docs.google.com/presentation/d/1R09AbE7EElFIiWoTEm_MpiQ7W9CQ8-AnF6cRzm5oLSY/edit#slide=id.p
Who: Dharam Madnani (dmadnani), Idilsu Guney (iguney1)
Introduction: We plan to examine how Elon Musk’s tweets affect the trading prices of cryptocurrencies (bitcoin, ethereum, dogecoin).
- If you are doing something new, detail how you arrived at this topic and what motivated you.
- We have always been interested in cryptocurrencies, and a data-driven look at how an influential figure such as Elon Musk can affect their prices is extremely interesting to us.
- What kind of problem is this? Classification? Regression? Structured prediction? Reinforcement Learning? Unsupervised Learning? Etc.
- Structured prediction (for converting figures produced by sentiment analysis to fluctuations in stock prices), Natural Language Processing (for sentiment analysis)
Related Work: Are you aware of any, or is there any prior work that you drew on to do your project? We are not aware of any research studies on this specific topic, which suits us, as we want to focus on something new. However, several news articles discuss how Musk’s tweets have a substantial effect on the trading price of cryptocurrencies. For example, a recent CNBC article raises concerns about the extent of the impact Elon Musk’s tweets have on the stock market and, more specifically, on cryptocurrencies. While it is interesting to analyze how Musk can have this much influence on trading prices, the article emphasizes the risk he creates for retail investors, since unstable prices can have devastating consequences for traders. For this reason, the article debates whether tweeting about stocks should be acceptable and whether Twitter should enforce regulations on people with a large following, such as Elon Musk.
- List of related articles:
Data: What data are you using (if any)?
- If you’re using a standard dataset (e.g. MNIST), you can just mention that briefly. Otherwise, say something more about where your data come from (especially if there’s anything interesting about how you will gather it).
- We’re going to train our sentiment analysis model on a larger set of tweets so that it learns to distinguish positive from negative sentiment. We used this dataset from Kaggle: https://www.kaggle.com/cerolacia/covid-19-tweet-classification/data
- Then, we’re going to use a smaller dataset containing only Elon Musk’s tweets that reference cryptocurrencies, which we will convert into a set of figures using the sentiment analysis model trained on the larger dataset. The resulting figures will represent the sentiment of each tweet. In this step, it will be important to also include tweets that don’t use the exact name of a cryptocurrency but rather nicknames and slang references, such as “doge” instead of “dogecoin”. We used this dataset from Kaggle: https://www.kaggle.com/ayhmrba/elon-musk-tweets-2010-2021?select=2011.csv
- We will convert the figures produced by our sentiment analysis into effects on price. To achieve this, we’re going to use a timeseries dataset of each cryptocurrency’s price over time and compare its fluctuations with our results. In this step, it will be important to account for the delay between when Musk tweets and when the price changes in a way potentially caused by the tweet. We used the timeseries dataset from Yahoo Finance for each respective cryptocurrency.
- How big is it? Will you need to do significant preprocessing?
- We’re going to choose a moderate size for our larger training dataset: big enough to produce a sentiment analysis model, but not so large that preprocessing takes an unreasonable amount of time.
- The dataset of Elon Musk’s tweets regarding cryptocurrencies will be relatively small: around 100-200 tweets.
- The size of the timeseries dataset of cryptocurrency prices will depend on the timeframe we choose to look at: for example, from 30 minutes to 24 hours after Musk tweets.
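The nickname matching described above could be sketched roughly as follows. This is only an illustration: the alias table and the function name `mentioned_coins` are our own assumptions, and the actual list of nicknames and slang would need to be curated by hand.

```python
import re

# Hypothetical alias table (an assumption for illustration only);
# the real list of nicknames and slang would need to be curated.
CRYPTO_ALIASES = {
    "bitcoin": ["bitcoin", "btc"],
    "ethereum": ["ethereum", "eth", "ether"],
    "dogecoin": ["dogecoin", "doge"],
}


def _compile(aliases):
    # Word boundaries keep a short alias such as "eth" from matching
    # inside an unrelated word such as "method".
    joined = "|".join(re.escape(alias) for alias in aliases)
    return re.compile(r"\b(?:" + joined + r")\b", re.IGNORECASE)


PATTERNS = {coin: _compile(aliases) for coin, aliases in CRYPTO_ALIASES.items()}


def mentioned_coins(tweet_text):
    """Return the sorted names of the cryptocurrencies a tweet refers to."""
    return sorted(
        coin for coin, pattern in PATTERNS.items() if pattern.search(tweet_text)
    )
```

A tweet for which `mentioned_coins` returns an empty list would be left out of the smaller dataset.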
Methodology: What is the architecture of your model?
- How are you training the model?
- We’re going to use 2 different models for our project. The first model will perform sentiment analysis using natural language processing and will be trained on a larger dataset of tweets. We want to train it on a larger dataset because the dataset of Elon Musk’s tweets on cryptocurrencies is too small to achieve high accuracy.
- The second model will apply our sentiment analysis to Elon Musk’s tweets and predict the fluctuations in cryptocurrency prices. Based on the date of each tweet and the results of our sentiment analysis, we will develop a prediction model of how Elon Musk’s tweets affect the trading price of cryptocurrencies using the timeseries dataset.
- More specifically, we will work on creating a model that converts Elon Musk’s tweets into a set of figures that represent a sentiment, which will then determine movements in cryptocurrency prices.
- If you are doing something new, justify your design. Also note some backup ideas you may have to experiment with if you run into issues.
- Currently, our idea for the prediction model (second model) is as follows: after applying the sentiment analysis from the first model, we will obtain a set of figures that help predict whether the price of a cryptocurrency will go up or down. However, we’re not sure whether these figures can accurately predict by how much the price will move. Therefore, a backup idea is to use the figures only to predict the direction of the movement and to treat their magnitude as a measure of certainty (for example, predicting that the price goes down with 70% probability rather than by a specific amount).
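As a rough sketch of how each tweet’s date could be paired with the timeseries dataset, the snippet below computes the percentage price change over a chosen window after a tweet. It assumes the prices are already loaded as two parallel lists sorted by time; the function names and that data layout are hypothetical, not the project’s actual code.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def price_at(timestamps, prices, when):
    """Return the most recent price at or before `when`, or None if there is none.

    `timestamps` must be sorted in ascending order and parallel to `prices`.
    """
    idx = bisect_right(timestamps, when) - 1
    return prices[idx] if idx >= 0 else None


def percent_change_after(timestamps, prices, tweet_time, window):
    """Percentage price change from `tweet_time` to `tweet_time + window`.

    Returns None when no price is available at or before the tweet.
    """
    start = price_at(timestamps, prices, tweet_time)
    end = price_at(timestamps, prices, tweet_time + window)
    if start is None or end is None or start == 0:
        return None
    return (end - start) / start * 100.0
```

The sign of the returned value gives the direction of the movement and its absolute value gives the magnitude, so the same helper could produce labels for both the main idea and the backup idea. The window would be varied (for example `timedelta(minutes=30)` up to `timedelta(hours=24)`) to account for the delay between a tweet and any price reaction.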
Metrics: What constitutes “success?”
- What experiments do you plan to run?
- After converting Musk’s tweets into figures and using these figures to predict fluctuations in stock prices, we will compare our predictions with the actual movements in the stock. We’re going to implement this comparison for each cryptocurrency that we’re going to consider (bitcoin, ethereum, dogecoin).
- For most of our assignments, we have looked at the accuracy of the model. Does the notion of “accuracy” apply for your project, or is some other metric more appropriate?
- Accuracy applies for our project, since we can compare the results of our testing data (and the predictions on the impact Musk’s tweets have on certain cryptocurrency stocks) with the actual value of the cryptocurrency stock.
- If you are doing something new, explain how you will assess your model’s performance.
- We’re going to assess our model’s performance by whether the model can accurately predict the direction of the movement, as well as the magnitude. We can use some intervals to assess accuracy with the magnitude prediction: for example, if our model’s prediction is within +/- 0.5 percentage points of the actual magnitude, then we will consider our results to be accurate.
- What are your base, target, and stretch goals?
- Base goal: Accurately predicting the direction of the movements in the trading price of cryptocurrencies.
- Target goal: Accurately predicting the magnitude of the movements in the trading price of cryptocurrencies within a wider interval, such as within +/- 0.5 percentage points of the actual magnitude.
- Stretch goal: Accurately predicting the magnitude of the movements in the trading price of cryptocurrencies within a narrower interval, such as within +/- 0.2 percentage points of the actual magnitude.
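The three goals above could be measured with two simple metrics, sketched below under the assumption that predicted and actual price movements are given as parallel lists of percentage changes. The function names are hypothetical and chosen only for illustration.

```python
def _sign(x):
    # Returns -1, 0, or 1 depending on the sign of x.
    return (x > 0) - (x < 0)


def _check_inputs(predicted, actual):
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one prediction is required")


def direction_accuracy(predicted, actual):
    """Fraction of predictions whose sign matches the actual movement (base goal)."""
    _check_inputs(predicted, actual)
    matches = sum(_sign(p) == _sign(a) for p, a in zip(predicted, actual))
    return matches / len(actual)


def within_tolerance_accuracy(predicted, actual, tolerance):
    """Fraction of predictions within `tolerance` percentage points of the actual change."""
    _check_inputs(predicted, actual)
    hits = sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual))
    return hits / len(actual)
```

With these helpers, the target goal corresponds to `within_tolerance_accuracy(predicted, actual, 0.5)` and the stretch goal to the same call with a tolerance of `0.2`.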
Ethics: Choose 2 of the following bullet points to discuss; not all questions will be relevant to all projects so try to pick questions where there’s interesting engagement with your project. (Remember that there’s not necessarily an ethical/unethical binary; rather, we want to encourage you to think critically about your problem setup.)
- What broader societal issues are relevant to your chosen problem space?
- Whether social media platforms should restrict the content shared by users with a large following is a societal issue relevant to our project. Because users such as Elon Musk have a lot of influence over the decisions of their followers, their posts can, in some cases, lead to negative consequences. In this specific example, Musk exposes retail investors to significant risk, as unstable prices can have devastating consequences for traders. Hence, whether social media platforms such as Twitter should regulate what verified users (celebrities, etc.) can and cannot share is open to debate. However, implementing such regulations can raise other concerns, such as limiting freedom of speech online. Our model will assess the extent of Musk’s impact on the trading price of cryptocurrencies: if prices move a lot when Musk tweets about a certain cryptocurrency, this could harm traders and fuel further debate about whether Twitter should restrict the content of Musk’s tweets.
- Why is Deep Learning a good approach to this problem?
- Deep Learning is a good approach for this problem because we’re able to analyze almost every tweet Musk has shared regarding cryptocurrencies and their impact on the trading prices. As Musk has numerous tweets, it wouldn’t be possible to conduct detailed analysis of these tweets and their effect on the fluctuations of cryptocurrency stocks without using tools of Deep Learning. Additionally, Deep Learning will be very helpful in conducting sentiment analysis of tweets and converting a sentiment into figures that would represent that sentiment. By using these figures, we’ll be able to tell by how much Musk’s tweets are expected to influence the trading prices.
Division of labor: Briefly outline who will be responsible for which part(s) of the project.
- Finding and preprocessing the larger training data of tweets that we will use in our sentiment analysis model: Idilsu
- Scraping Elon Musk’s tweets about cryptocurrencies and using sentiment analysis model on this dataset to convert tweets into a set of figures: Dharam
- Using these figures to predict fluctuations in stock prices of cryptocurrencies using the timeseries dataset of each cryptocurrency
- Ethereum: Idilsu
- Bitcoin: Idilsu
- Dogecoin: Dharam
- Calculating our accuracy rate: Dharam
- Preparing the poster: Idilsu and Dharam
Built With
- python
- tensorflow