What it does
Extrapolating Sentiment Via r/wallstreetbets
On a trial run of the program, we generated a data set of the 10 most mentioned stock tickers and the language used around them. We then used Asad70’s application, reddit-sentiment-analysis, to perform sentiment analysis on the top 5 tickers. The analysis returned a score between 0 and 1 for each of three possible trends (Bearish, Neutral, and Bullish). These scores were then combined into a total trend value for each stock, ranging from -1 (most bearish) to 1 (most bullish).
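A minimal sketch of how per-comment scores for one ticker might be summarized, assuming each comment has already been scored. The key names, the averaging step, and the 0.05 labeling threshold are our assumptions for illustration and are not taken from reddit-sentiment-analysis; the sample scores are made up.

```python
from statistics import mean

SCORE_KEYS = ("bearish", "neutral", "bullish", "total")


def summarize_ticker(comment_scores):
    """Average the per-comment sentiment scores collected for one ticker."""
    if not comment_scores:
        raise ValueError("need at least one scored comment")
    return {key: mean(s[key] for s in comment_scores) for key in SCORE_KEYS}


def label_trend(total, threshold=0.05):
    """Map a total trend value in [-1, 1] to a label (threshold is an assumption)."""
    if total >= threshold:
        return "Bullish"
    if total <= -threshold:
        return "Bearish"
    return "Neutral"


# Made-up scores for two comments about the same ticker (illustration only).
example_scores = [
    {"bearish": 0.10, "neutral": 0.60, "bullish": 0.30, "total": 0.50},
    {"bearish": 0.40, "neutral": 0.50, "bullish": 0.10, "total": -0.30},
]
summary = summarize_ticker(example_scores)
trend = label_trend(summary["total"])
```

Averaging keeps every ticker on the same scale regardless of how many comments mention it, which makes the top picks directly comparable.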
What Can Traders Do With This Information?
Traders can use the correlations between the number of queries about a stock, its trading volume, its volatility, its daily change in price, and bullish or bearish sentiment to anticipate future market trends.
Because an increase in queries tends to precede an increase in trading volume by 1-3 days, investors can use real-time query tracking and predictive analysis to forecast a stock's activity at least several hours before it occurs.
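To make the lag idea concrete, here is a rough sketch that correlates daily query counts with trading volume shifted by a number of days and picks the lag with the strongest correlation. The function names are hypothetical and the two series are synthetic, built so that volume trails queries by two days; they are not measurements from our run or from the papers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def lagged_correlation(queries, volume, lag):
    """Correlate queries on day t with volume on day t + lag."""
    if lag < 0 or lag >= len(queries) - 1:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson(queries, volume)
    return pearson(queries[:-lag], volume[lag:])


def best_lag(queries, volume, max_lag=3):
    """Return the lag (in days) with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda k: lagged_correlation(queries, volume, k))


# Synthetic series: volume repeats the query pattern two days later.
queries = [3, 5, 2, 8, 6, 1, 7, 4, 9, 2]
volume = [4, 4, 3, 5, 2, 8, 6, 1, 7, 4]
lag_days = best_lag(queries, volume)
```

With real data the peak would be far less clean, so the chosen lag should be treated as a hint to investigate rather than a trading signal on its own.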
How We Built It
We combined academic research, existing APIs, and new code to build a predictive tool and a corresponding trading strategy for investors.
What we did:
- Analyzed 5 academic papers exploring the correlations between the number of search engine queries about a stock, its trading volumes, its volatility, and its daily change in price.
- Created a personal Reddit script application and integrated its API credentials
- Modified Asad70’s application, reddit-sentiment-analysis, to use our Reddit script API credentials
- Tested our output by generating a data set of the 10 most mentioned stock tickers on r/wallstreetbets and performing sentiment analysis on the top 5 picks
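The ticker-ranking step in the list above can be sketched roughly as follows. This is an illustrative approach, not the code from reddit-sentiment-analysis: the regular expression, the `known_tickers` set, and the sample comments are all assumptions, and a real run would read comments from the Reddit API.

```python
import re
from collections import Counter

# Assumed pattern: 2 to 5 consecutive capital letters as a standalone word.
TICKER_PATTERN = re.compile(r"\b[A-Z]{2,5}\b")


def count_ticker_mentions(comments, known_tickers, ignore=frozenset()):
    """Count how many comments mention each known ticker."""
    counts = Counter()
    for text in comments:
        # A set counts each ticker at most once per comment.
        found = set(TICKER_PATTERN.findall(text))
        counts.update(t for t in found
                      if t in known_tickers and t not in ignore)
    return counts


def top_tickers(comments, known_tickers, n=10, ignore=frozenset()):
    """Return the n most mentioned tickers as (ticker, count) pairs."""
    return count_ticker_mentions(comments, known_tickers, ignore).most_common(n)


# Made-up comments for illustration only.
sample_comments = [
    "GME to the moon, YOLO",
    "Holding GME and AMC",
    "AMC looks weak, buying TSLA",
    "GME GME GME",
]
ranking = top_tickers(sample_comments, {"GME", "AMC", "TSLA"}, n=2)
```

Checking candidates against a list of known tickers filters out capitalized slang such as "YOLO", which a bare regular expression would otherwise count.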
Modified Code
Original
import re
import en_core_web_sm
nlp = en_core_web_sm.load()
Stopwords = nlp.Defaults.stop_words
reddit = praw.Reddit(user_agent="Comment Extraction", client_id="", client_secret="", username="", password="")
Modified
import spacy  # replaces: import en_core_web_sm
# nlp = en_core_web_sm.load()
stopwords = []  # replaces: nlp.Defaults.stop_words
reddit = praw.Reddit(
    user_agent="Comment Extraction",
    client_id="qs4uiqJHU2eiLzYneR####",
    client_secret="MkWEIZsGtkVKQZyWj29KcEjOBE####",
    username="Up2Early4this",
    password="###",
)
Challenges we ran into
Understanding the role of Client IDs and Client Secrets in Reddit APIs
Our team had never integrated a Reddit API into code before, so we had to work through registering an application and authenticating with its client ID and client secret before the codebase could connect.
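One way to keep the client ID and client secret out of the source file is to read them from environment variables. The sketch below is a suggestion and not how our submission handled it; the `REDDIT_` variable names and the helper function are hypothetical, while the keyword names match those passed to `praw.Reddit` in the code above.

```python
import os

REQUIRED_KEYS = ("client_id", "client_secret", "username", "password")


def load_reddit_credentials(env=None, prefix="REDDIT_"):
    """Build keyword arguments for praw.Reddit from environment variables."""
    env = os.environ if env is None else env
    creds, missing = {}, []
    for key in REQUIRED_KEYS:
        name = prefix + key.upper()  # e.g. REDDIT_CLIENT_ID (assumed name)
        value = env.get(name, "")
        if value:
            creds[key] = value
        else:
            missing.append(name)
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    creds["user_agent"] = env.get(prefix + "USER_AGENT", "Comment Extraction")
    return creds


# Usage (requires the praw package and real credentials):
#   import praw
#   reddit = praw.Reddit(**load_reddit_credentials())
```

Failing early with the list of missing variables makes a misconfigured environment easier to diagnose than an authentication error from Reddit.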
NLTK Issues
A large portion of our coding effort went into setting up and running NLTK’s VADER sentiment analyzer locally. We hit several roadblocks, most notably installing en_core_web_sm, the small English spaCy model that the original code imports for its stop words.
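A possible alternative to commenting out the model import is to fall back to an empty stop-word set only when en_core_web_sm cannot be loaded. This is a sketch of that idea, not what our modified code does; the helper names are hypothetical.

```python
def load_stopwords():
    """Return spaCy's English stop words, or an empty set if the model is unavailable."""
    try:
        import en_core_web_sm  # installed separately from spaCy itself
        nlp = en_core_web_sm.load()
    except (ImportError, OSError):
        # Model package missing or not loadable: continue without stop words.
        return set()
    return set(nlp.Defaults.stop_words)


def remove_stopwords(tokens, stopwords):
    """Drop tokens whose lowercase form is a stop word."""
    return [t for t in tokens if t.lower() not in stopwords]


stopwords = load_stopwords()
```

With this fallback the script still runs on a machine without the model, at the cost of noisier word counts until the model is installed.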
Built With
- natural-language-processing
- python
- reddit-script-api
- vader
