Reddit comments can be indicative of trends in companies, and with massive amounts of data available online, why not use this information to make some profit? We therefore set out to build an automated system: you ask the API about a company of interest, and it returns a predicted trend suggesting whether or not you should invest in it.

What it does

Given a company, our service processes all the related comments on Reddit and outputs a prediction of the company's future stock price trend. A user should be able to ask about a company by voice, and the model returns the predicted trend of its stock price.

How I built it

We downloaded submissions from Reddit and scraped all the posts relevant to our list of companies. We preprocessed the data and built a model to predict the stock price for the following day. After training the model and tuning its hyperparameters overnight, we integrated it into our web service, where the user can query a specific company directly by voice using Nuance Mix.
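As a rough illustration of the "relevant to our list of companies" step, here is a minimal Python sketch that tags comments with the companies they mention and counts mentions per day. The alias table, the function names, and the choice of daily mention counts as a feature are all assumptions made for this example; the write-up does not describe the actual matching rules or features.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Hypothetical alias table; the real company list is not given in the write-up.
COMPANY_ALIASES = {
    "AAPL": ["apple", "aapl"],
    "TSLA": ["tesla", "tsla"],
}

# One compiled, case-insensitive, whole-word pattern per company.
PATTERNS = {
    ticker: re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )
    for ticker, names in COMPANY_ALIASES.items()
}


def companies_mentioned(text):
    """Return the set of tickers whose aliases appear as whole words in text."""
    return {ticker for ticker, pattern in PATTERNS.items() if pattern.search(text)}


def daily_mention_counts(comments):
    """Count mentions per (ticker, UTC date) from (unix_timestamp, text) pairs."""
    counts = Counter()
    for created_utc, text in comments:
        day = datetime.fromtimestamp(created_utc, tz=timezone.utc).date()
        for ticker in companies_mentioned(text):
            counts[(ticker, day)] += 1
    return counts
```

Whole-word matching keeps "pineapple" from counting as a mention of Apple, though ambiguous names (the fruit versus the company) would still need extra filtering.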

Challenges I ran into

Too much data to process, which also means too much noise. We had 80GB of text data, and processing it took hours. Our features are not yet refined, which results in unstable model performance.
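One common way to handle a text dump of this size is to stream it line by line instead of loading it into memory. The sketch below assumes the data is newline-delimited JSON with `body` and `created_utc` fields, as in commonly distributed Reddit comment dumps; the write-up does not state the actual file format, so treat the field names and the `iter_comments` helper as hypothetical.

```python
import json


def iter_comments(lines):
    """Yield (created_utc, body) pairs from an iterable of JSON lines.

    Blank lines, corrupt rows, and deleted or removed comments are skipped,
    so only one line is held in memory at a time.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate the occasional corrupt line
        if not isinstance(record, dict):
            continue
        body = record.get("body")
        created = record.get("created_utc")
        if body is None or created is None:
            continue
        if body in ("[deleted]", "[removed]"):
            continue  # placeholder text carries no signal
        yield int(created), body
```

Because a file object is itself an iterable of lines, `iter_comments(open(path, encoding="utf-8"))` would process a large file without reading it all at once.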

Accomplishments that I'm proud of

We built a web service API backed by a solid trained model that predicts the stock price change over the next 24 hours. Overall, we achieved about 55% prediction accuracy by training on 3 months of Reddit comment data, which accumulates to a 17% profit over 15 days of aggressive investing.
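To put the 17% figure in perspective, the short Python sketch below works out the average daily return that a 17% gain over 15 days would imply under daily compounding. The write-up does not say how the 17% was calculated, so daily compounding and the two helper functions are assumptions made purely for illustration.

```python
def implied_daily_return(total_return, days):
    """Average per-day return that compounds to total_return over days."""
    return (1.0 + total_return) ** (1.0 / days) - 1.0


def compound(daily_return, days):
    """Total return after compounding daily_return for the given number of days."""
    return (1.0 + daily_return) ** days - 1.0


daily = implied_daily_return(0.17, 15)
print(f"Implied average daily return: {daily:.2%}")  # about 1.05% per day
print(f"Check, compounded over 15 days: {compound(daily, 15):.2%}")
```

Under this assumption, the reported result corresponds to gaining a little over 1% per day on average across the 15-day period.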

What I learned

We learned how to preprocess massive data efficiently and how to incorporate the Nuance Mix API into our web server.

What's next for Reddit Profit

Aggregate more real-time data from Reddit and other sources, e.g., the comments and votes from Reddit. Work on the API so that it is easy to gather data. Do more comparative analysis and improve the results. Predict over a longer time period for the companies.
