Web-scraping - CNN-LSTM stock prediction pipeline

Inspiration

What it does

The project scrapes data from the web for each stock listed on the S&P 500 and normalises it so that it can be processed by a machine learning model. We chose a CNN-LSTM model, which combines two networks: a CNN first processes the input data, and its output tensor is then passed to an LSTM for further processing. After training, the model is evaluated on test data (also web-scraped), and it outputs an overall accuracy score between 0 and 1 along with a predicted stock price for each company on the S&P 500.
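The normalisation step can be illustrated with a small sketch. It is not the project's actual code: it assumes min-max scaling and fixed-length sliding windows, and the function names `min_max_normalise` and `make_windows`, the window length, and the prices are all made up for illustration.

```python
from typing import List, Tuple


def min_max_normalise(prices: List[float]) -> List[float]:
    """Scale a price series into the range [0, 1] (assumed normalisation scheme)."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        # A flat series carries no scale information; map it to zeros.
        return [0.0 for _ in prices]
    return [(p - lo) / (hi - lo) for p in prices]


def make_windows(series: List[float], window: int) -> Tuple[List[List[float]], List[float]]:
    """Split a series into (input window, next value) training pairs."""
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


closes = [100.0, 102.0, 101.0, 105.0, 110.0]  # made-up closing prices
scaled = min_max_normalise(closes)
X, y = make_windows(scaled, window=3)  # window length is an arbitrary choice
```

Each row of `X` holds three consecutive scaled prices and the matching entry of `y` is the price that follows them. In practice the minimum and maximum would usually be taken from the training split only, so that test data does not leak into the scaling.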

How we built it

We built it by studying existing models and research papers on CNN-LSTM architectures, and we implemented the model with the Keras library.
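As a rough idea of what a CNN-LSTM looks like in Keras, here is a minimal sketch. It is not the model we trained: the layer sizes, the window length of 30, the single input feature, and the helper name `build_cnn_lstm` are all assumptions chosen for illustration.

```python
from tensorflow import keras


def build_cnn_lstm(window: int = 30, n_features: int = 1) -> keras.Model:
    """Build a small CNN-LSTM regressor (hypothetical sizes, not the project's)."""
    model = keras.Sequential([
        # Input: `window` time steps, each with `n_features` values.
        keras.layers.Input(shape=(window, n_features)),
        # The CNN stage extracts local patterns along the time axis.
        keras.layers.Conv1D(filters=32, kernel_size=3, padding="causal", activation="relu"),
        keras.layers.MaxPooling1D(pool_size=2),
        # The LSTM stage consumes the CNN's output sequence.
        keras.layers.LSTM(64),
        # A single output unit for the predicted (normalised) price.
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


model = build_cnn_lstm()
```

Calling `model.fit(X, y)` on arrays of shape `(samples, window, n_features)` and `(samples,)` would then train it. The convolution shortens and summarises the sequence before the LSTM sees it, which is the usual motivation for stacking the two.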

Challenges we ran into

None of us had any prior experience in building neural networks, so the learning curve was quite steep.

Accomplishments that we're proud of

We can predict stock price data with an accuracy of approx. 52% (for reference, counting cards gives you a 51% chance of winning).
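This write-up does not spell out how the accuracy score is computed. One common reading, assumed here purely for illustration, is directional accuracy: the fraction of days on which the predicted price moves in the same direction as the actual price. The function name `directional_accuracy` and the numbers below are made up.

```python
from typing import Sequence


def directional_accuracy(
    previous: Sequence[float],
    actual: Sequence[float],
    predicted: Sequence[float],
) -> float:
    """Fraction of predictions that get the up/down direction right (0 to 1)."""
    if not previous or not (len(previous) == len(actual) == len(predicted)):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for prev, act, pred in zip(previous, actual, predicted):
        # A hit means the prediction and the real price agree on "up" versus "not up".
        if (act > prev) == (pred > prev):
            hits += 1
    return hits / len(previous)


previous = [100.0, 102.0, 101.0, 105.0]   # made-up prices on day t
actual = [102.0, 101.0, 105.0, 110.0]     # made-up prices on day t+1
predicted = [101.0, 103.0, 104.0, 104.0]  # made-up model outputs for day t+1
score = directional_accuracy(previous, actual, predicted)  # 0.5
```

Under this reading, a score of 0.5 is what a coin flip would achieve on a balanced set of up and down days.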

What we learned

We all now have a much better understanding of how neural networks are built and how they function. We also learnt several techniques for data normalisation and labelling.