Inspiration

  • Our interest in quantifying social sentiment around the stock market with machine learning and natural language processing, together with our activities in the Applied Mathematics Club and the Electronics and Engineering Club at Berkeley City College (open courseware), led us to undertake this project.

What it does; How we built it

We set out to do a lot: a dynamic system that would take in new data to reinforce our ML model. We chose instead to stick to a fixed data set and offer sentiment analysis on historical data. We designed the platform in Figma, supplemented it with our own JavaScript, and used Flask to connect the front end and back end. For a given ticker, the platform returns an aggregate sentiment: bearish, bullish, or neutral.
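As a rough illustration of the aggregate sentiment (bearish, bullish, neutral) for a ticker, here is a minimal sketch. It assumes each post has already been scored in the range -1 to 1 by some model. The function name `aggregate_sentiment` and the cutoff values are hypothetical and are not taken from the project's code.

```python
from statistics import mean


def aggregate_sentiment(scores, bearish_below=-0.05, bullish_above=0.05):
    """Collapse per-post sentiment scores into one label for a ticker.

    scores: iterable of floats in [-1, 1], one per post (assumed input format).
    bearish_below / bullish_above: illustrative cutoffs, not the project's values.
    """
    scores = list(scores)
    if not scores:
        # With no data for the ticker, fall back to a neutral reading.
        return "neutral"
    avg = mean(scores)
    if avg > bullish_above:
        return "bullish"
    if avg < bearish_below:
        return "bearish"
    return "neutral"


if __name__ == "__main__":
    print(aggregate_sentiment([0.6, 0.2, -0.1]))
```

A Flask route could call a function like this with the scores stored for the requested ticker and return the label to the front end as JSON.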

Challenges we ran into

We ran into problems connecting user input with our ML pipeline, implementing live web scraping to create a dynamic data set, and creating dynamically updating sentiment thresholds developed specifically for each ticker using our own algorithm. We also had trouble measuring alpha generation: back-testing our position management recommendations based on our sentiment readings was a difficult task.
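One simple way to think about per-ticker thresholds is to derive them from that ticker's own score history. The sketch below uses the lower and upper quartiles of past scores as the bearish and bullish cutoffs. This is a stand-in technique chosen for illustration, not our own algorithm; the name `ticker_thresholds` and the fallback values are hypothetical.

```python
from statistics import quantiles


def ticker_thresholds(history, fallback=(-0.05, 0.05)):
    """Return (bearish_below, bullish_above) for one ticker.

    history: past sentiment scores for the ticker (assumed floats in [-1, 1]).
    fallback: illustrative default used when there is too little history.
    """
    history = list(history)
    if len(history) < 2:
        # Too few points to estimate quartiles; use the default cutoffs.
        return fallback
    q1, _median, q3 = quantiles(history, n=4)  # three quartile cut points
    return q1, q3


if __name__ == "__main__":
    low, high = ticker_thresholds([-0.4, -0.2, 0.0, 0.2, 0.4])
    print(low, high)
```

Recomputing the cutoffs whenever new scores arrive would make them update dynamically, so a ticker that usually attracts upbeat chatter needs a stronger signal to be labeled bullish.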

Accomplishments that we're proud of

Overcoming the hurdle of connecting user input on our website with our ML pipeline, and being able to limit our scope when we needed to.

What we learned

Version control across different languages within a short time frame (especially given that we were using a micro web framework, i.e. Flask, with JavaScript, CSS, and Python) and across different platforms (PyCharm, Visual Studio, Git, Google Colaboratory).

What's next for probabilistic-sentiment-analysis

What's next is ramping up to live sentiment analysis, scraping social media through live APIs, and position management recommendations across different time frames. We also want to eventually integrate GPT-3 as a sandbox feature that lets users determine the sentiment of their own input. We would also use Cohere to generate synopses of articles and run our sentiment algorithm on the generated synopses to evaluate the presence of strong biases before articles are published.
