Introduction:
- In the unprecedented and unpredictable era of COVID-19, the world, including the stock market, is in uncharted territory. We will use the number of daily confirmed COVID-19 cases to build a model that forecasts commodity and stock prices, since we believe the health of the population is correlated with the health of the economy and therefore with stock prices. We are implementing a research paper that uses a bidirectional LSTM to analyze the impact of COVID-19 on stock price forecasting. We chose this paper because its model and analysis largely align with what we wanted to accomplish: using deep learning models to forecast stock prices from COVID-19 data. This is a structured prediction problem, since we use historical trading data and COVID-19 case counts to predict future stock prices. Although this is a challenging task, we are excited to apply our deep learning knowledge to complex financial markets.
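To make the prediction setup concrete, here is a minimal sketch of how historical prices and case counts could be turned into training examples. The function name `make_windows`, the `lookback` parameter, and the toy numbers are our own illustrative assumptions, not something taken from the paper.

```python
def make_windows(prices, cases, lookback):
    """Pair each lookback-day window of (price, cases) with the next day's price."""
    if len(prices) != len(cases):
        raise ValueError("prices and cases must be aligned day by day")
    inputs, targets = [], []
    for start in range(len(prices) - lookback):
        end = start + lookback
        # One training input: `lookback` consecutive days of both features.
        inputs.append([(prices[t], cases[t]) for t in range(start, end)])
        # Its target: the price on the day right after the window.
        targets.append(prices[end])
    return inputs, targets


# Toy, made-up numbers purely for illustration.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
cases = [0, 5, 9, 20, 31]
X, y = make_windows(prices, cases, lookback=3)
```

Each entry of `X` is a short sequence that a recurrent model could consume, and the matching entry of `y` is the value it should predict.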
Challenges:
- One challenge we encountered is that downloading the data is not straightforward. We found that the easiest way to import the stock data is the Alpha Vantage API, specifically its Time Series Stock API. However, the API comes with usage restrictions, so we needed to write a script that downloads the stock data over a given time period.
- The magnitude of stock prices varies widely (for example, $30/share, $450/share, and $2000/share). We need to normalize the prices to the range -1 to 1 so that our model can train effectively on every stock in the dataset. We did not anticipate this challenge, but ran into it as we thought more deeply about how to implement the model.
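For the download challenge above, a script along these lines is one possible shape. This is a rough sketch, not our exact script: the helper names are hypothetical, the `pause_seconds` value is an assumed placeholder for whatever the API restrictions require, and the query parameters and JSON field names (`TIME_SERIES_DAILY`, `"Time Series (Daily)"`, `"4. close"`) should be checked against the current Alpha Vantage documentation.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://www.alphavantage.co/query"


def build_daily_url(symbol, api_key):
    """Build the request URL for one symbol's daily time series."""
    params = {
        "function": "TIME_SERIES_DAILY",
        "symbol": symbol,
        "outputsize": "full",  # assumed: request the full history, not the recent slice
        "apikey": api_key,
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def parse_daily_closes(payload, start, end):
    """Return sorted (date, close) pairs with start <= date <= end (ISO date strings)."""
    series = payload.get("Time Series (Daily)", {})
    rows = [
        (day, float(bar["4. close"]))
        for day, bar in series.items()
        if start <= day <= end  # ISO dates compare correctly as strings
    ]
    return sorted(rows)


def download_closes(symbols, api_key, start, end, pause_seconds=15.0):
    """Fetch closes for each symbol, sleeping between calls to respect API limits."""
    result = {}
    for i, symbol in enumerate(symbols):
        if i > 0:
            time.sleep(pause_seconds)  # assumed delay; tune to the actual restriction
        with urllib.request.urlopen(build_daily_url(symbol, api_key)) as response:
            payload = json.load(response)
        result[symbol] = parse_daily_closes(payload, start, end)
    return result
```

Keeping the parsing separate from the network call means the date filtering can be checked against a saved response without spending API requests.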
Insights:
- The model performed very badly because we did not normalize the data. Our LSTM bounds its outputs between -1 and 1 because of its tanh activation, and we expected the dense layers to rescale them to actual prices. The model cannot learn that rescaling easily, so we plan to normalize the data and predict a “relative” price instead. As a result, we do have “results”, but they are very poor.
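The fix we have in mind could look roughly like the sketch below. It assumes per-stock min-max scaling, which is only one way to map prices into the range -1 to 1, and the function names and sample prices are made up for illustration.

```python
import math


def fit_scaler(prices):
    """Record the min and max of one stock's price history."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        raise ValueError("cannot scale a constant price series")
    return lo, hi


def to_unit_range(price, lo, hi):
    """Map a price linearly so that lo -> -1 and hi -> 1."""
    return 2.0 * (price - lo) / (hi - lo) - 1.0


def from_unit_range(scaled, lo, hi):
    """Invert to_unit_range to recover a dollar price from a model output."""
    return (scaled + 1.0) / 2.0 * (hi - lo) + lo


# Made-up history for a single expensive stock.
history = [1800.0, 1900.0, 2000.0]
lo, hi = fit_scaler(history)
scaled = [to_unit_range(p, lo, hi) for p in history]
recovered = [from_unit_range(s, lo, hi) for s in scaled]

# tanh never leaves [-1, 1], so a raw $2000 target is unreachable without rescaling.
bounded = math.tanh(2000.0)
```

With this setup the model predicts the scaled "relative" price, and `from_unit_range` converts a prediction back to dollars. In practice the min and max would likely be fit on the training period only, so that future prices do not leak into the scaling.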
Plan:
- Yes, we are on track with our project: we have collected the stock data, preprocessed it, and started implementing our model. We are glad to have most of our baseline goal done. We just need to normalize the stock prices and train the model (at least, we think normalizing will fix the problem). However, to meet our target goal, we still need to incorporate the COVID data and use it to further train and improve our model.
- We are not planning any major changes, but we did make some smaller decisions as the project progressed. For instance, we decided to use daily prices of 50 active stocks from 1991 to the present as our historical data, and the same stocks from 1/1/2020 to the present as our COVID-era stock data. Our COVID data will be pulled directly from the Johns Hopkins dataset.
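Incorporating the COVID data will require lining it up with trading days, since case counts are reported every day while markets close on weekends. The sketch below shows one way this could work. It assumes the case data arrives as cumulative counts keyed by date, and the function names and numbers are hypothetical.

```python
from datetime import date


def daily_new_cases(cumulative):
    """Turn {date: cumulative count} into {date: new cases reported that day}."""
    new_cases = {}
    previous = 0
    for day in sorted(cumulative):
        # Clamp at zero so a downward data revision does not yield negative cases.
        new_cases[day] = max(cumulative[day] - previous, 0)
        previous = cumulative[day]
    return new_cases


def align_to_trading_days(closes, new_cases):
    """Keep only trading days, attaching that day's case count (0 if none reported)."""
    return [(day, closes[day], new_cases.get(day, 0)) for day in sorted(closes)]


# Made-up values: a Friday and the following Monday are trading days.
cumulative = {
    date(2020, 3, 6): 100,
    date(2020, 3, 7): 130,
    date(2020, 3, 8): 170,
    date(2020, 3, 9): 250,
}
closes = {date(2020, 3, 6): 50.0, date(2020, 3, 9): 46.0}
aligned = align_to_trading_days(closes, daily_new_cases(cumulative))
```

This version simply drops weekend case counts. Summing them into the next trading day is an alternative we may try, and dates before the pandemic fall back to zero cases so the 1991 onward history can use the same format.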