Over the past year, social media has increasingly intersected with the finance industry. Stocks like Tesla and Palantir, cryptocurrencies, and, more recently, "meme" stocks like GameStop and AMC have blown up on social media 😎. Our group was inspired by this trend to create...
What it does
...a website that aggregates discussions across trading-related subreddits (r/investing, r/stocks, r/securityanalysis, r/finance, and r/robinhood) and sorts them into trending topic cards, so users can conveniently find news and analysis about the most relevant stocks and ETFs in the market. As social media's influence on finance grows, we think these discussions deserve more attention.
How we built it
For our website, we used the Bootstrap framework for styling, plus hover.css and animate.css for animations, to get responsive and dynamic content. To extract data from Reddit, we used the PRAW library, and we then applied natural language processing (NLP) techniques to identify and categorize popular discussion topics and stocks.
We took these topics and aggregated them onto our website into easily comprehensible cards for users to interact with.
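The Reddit extraction step could look roughly like the sketch below. This is not our exact code: the `Post` record, the `fetch_posts` helper, the choice of the `hot` listing, and the default limit of 50 are all assumptions made for illustration. The function takes an already-configured client so that it can be tried with a stand-in object.

```python
from dataclasses import dataclass

# Subreddits named in this writeup.
SUBREDDITS = ["investing", "stocks", "securityanalysis", "finance", "robinhood"]


@dataclass
class Post:
    """Hypothetical record holding the fields we keep from each submission."""
    subreddit: str
    title: str
    body: str
    score: int


def fetch_posts(reddit, subreddits=SUBREDDITS, limit=50):
    """Collect submissions from each subreddit's "hot" listing.

    `reddit` is expected to behave like a praw.Reddit instance; using the
    "hot" listing (rather than "new" or "top") is an assumption.
    """
    posts = []
    for name in subreddits:
        for submission in reddit.subreddit(name).hot(limit=limit):
            posts.append(
                Post(
                    subreddit=name,
                    title=submission.title,
                    body=submission.selftext,  # empty string for link posts
                    score=submission.score,
                )
            )
    return posts
```

With PRAW installed, the client would be created with your own API credentials, along the lines of `praw.Reddit(client_id=..., client_secret=..., user_agent=...)`, and then passed to `fetch_posts`.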
Challenges we ran into
We tried numerous NLP approaches, many of which did not pan out, including topic modeling and TF-IDF. We ended up using the Bidirectional Encoder Representations from Transformers (BERT) model to embed our raw text as numeric vectors and the HDBSCAN clustering algorithm to group those vectors into potential topics. To find relevant stocks, we parsed the text with regular expressions. Even after tinkering with so many approaches, our final result still flags certain irrelevant topics.
Additionally, due to time constraints, we weren't able to incorporate other social media sources, such as Twitter or Facebook.
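A minimal sketch of the embed-then-cluster idea is shown below. It assumes the `sentence-transformers` and `hdbscan` packages; the model name `all-MiniLM-L6-v2`, the `min_cluster_size` value, and both helper names are illustrative choices, not necessarily what we used.

```python
from collections import defaultdict


def embed_and_cluster(texts, model_name="all-MiniLM-L6-v2", min_cluster_size=5):
    """Embed texts with a BERT-style sentence encoder, then cluster with HDBSCAN.

    The model name and min_cluster_size are assumed values for illustration.
    """
    # Imported lazily because both are heavy third-party dependencies.
    from sentence_transformers import SentenceTransformer
    import hdbscan

    embeddings = SentenceTransformer(model_name).encode(texts)
    clusterer = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size)
    # Cast to float64 to be safe; returns one integer label per input text.
    return clusterer.fit_predict(embeddings.astype("float64")).tolist()


def group_by_label(texts, labels):
    """Group texts by cluster label, dropping HDBSCAN's noise points."""
    topics = defaultdict(list)
    for text, label in zip(texts, labels):
        if label == -1:  # HDBSCAN assigns -1 to points it treats as noise
            continue
        topics[label].append(text)
    return dict(topics)
```

Each resulting group is a candidate topic. Because HDBSCAN does not force every post into a cluster, off-topic posts can fall out as noise, although in our experience some irrelevant topics still get through.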
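For the stock-parsing step mentioned above, here is one way a regular expression could pull tickers out of post text. The pattern, the `KNOWN_TICKERS` allowlist, and the helper names are assumptions for illustration, not our exact implementation. Bare uppercase words are only accepted if they appear in the allowlist, since words like "CEO" or "YOLO" would otherwise be flagged.

```python
import re
from collections import Counter

# Hypothetical allowlist; a real one would come from an exchange listing.
KNOWN_TICKERS = {"TSLA", "PLTR", "GME", "AMC"}

# An optional "$" followed by 1-5 uppercase letters, not glued to other
# letters, digits, or another "$" on either side.
TICKER_RE = re.compile(r"(?<![A-Za-z0-9$])(\$?)([A-Z]{1,5})(?![A-Za-z0-9])")


def extract_tickers(text, known=KNOWN_TICKERS):
    """Return ticker symbols in order of appearance.

    Cashtags such as "$GME" are always accepted; bare symbols such as
    "GME" are accepted only when they appear in `known`.
    """
    found = []
    for dollar, symbol in TICKER_RE.findall(text):
        if dollar or symbol in known:
            found.append(symbol)
    return found


def count_mentions(posts, known=KNOWN_TICKERS):
    """Count how many posts mention each ticker (at most once per post)."""
    counts = Counter()
    for post in posts:
        counts.update(set(extract_tickers(post, known)))
    return counts
```

Counting each ticker once per post keeps a single repetitive post from dominating the trending list.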
Accomplishments that we're proud of
This was the first hackathon for all of us, so we were happy to end up with a working website and a nice-looking final result.
What we learned
We learned how to apply high-level NLP techniques to a real-world problem.
What's next for substonks
We would like to incorporate the additional social media sources that we didn't get to. Our topic extraction could also be improved considerably, and we may add a text summarization feature.
Created by Maxwell Bai, Avik Rao, and Jenna Li (Team 12)