About the Project
The "Stock Market Sentiment Analysis" project bridges traditional stock analysis with real-world sentiment indicators, offering investors a more comprehensive view of market trends. While stock prices often reflect quantitative factors like revenue and earnings, they’re also impacted by qualitative factors such as investor sentiment and news. This project enables data-driven, sentiment-informed decisions in a complex market environment by combining technical indicators with sentiment data.
Inspiration
Investment markets are influenced by both sentiment and hard data. Recognizing the potential of sentiment to impact stock prices led us to create a model that analyzes not only historical price trends but also public sentiment surrounding key tickers. By understanding the tone in news articles and other media, we aimed to deepen our insights into stock movements and investor behavior.
What It Does
The Stock Market Sentiment Analysis project:
- Scrapes Yahoo Finance for stock-related news articles, focusing on specific stock tickers.
- Analyzes Sentiment by categorizing articles as positive, negative, or neutral.
- Combines Sentiment with Technical Data by calculating technical indicators such as EMA (Exponential Moving Average) and SMA (Simple Moving Average) ratios to track trends.
- Predicts Stock Movement using machine learning algorithms, including Logistic Regression, Decision Trees, Random Forests, and XGBoost, to identify patterns between sentiment, technical indicators, and stock performance.
- Backtests Different Strategies to identify the most effective trading approach based on historical data and sentiment.
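To make the sentiment step concrete, here is a minimal sketch of sorting headlines into positive, negative, or neutral. The write-up does not say which sentiment tool was used, so this sketch assumes a simple word-count approach. The word lists, function names, and headlines are all hypothetical.

```python
import re

# Hypothetical word lists for illustration only; a real project would likely
# use a finance-specific lexicon or a trained sentiment model.
POSITIVE_WORDS = {"beat", "beats", "surge", "surges", "gain", "gains",
                  "upgrade", "record", "strong"}
NEGATIVE_WORDS = {"miss", "misses", "fall", "falls", "drop", "drops",
                  "downgrade", "lawsuit", "weak"}


def sentiment_score(text: str) -> int:
    """Return (positive word count) minus (negative word count)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives - negatives


def classify_sentiment(text: str) -> str:
    """Map the score onto the three labels: positive, negative, neutral."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up headlines, one per label.
headlines = [
    "Apple beats estimates as services revenue hits record",
    "Tesla shares drop after analyst downgrade",
    "Microsoft schedules annual shareholder meeting",
]
labels = [classify_sentiment(h) for h in headlines]
print(labels)
```

A word-count scorer like this misses negation and context ("not strong"), which is one reason a trained model may be preferable in practice.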
How We Built It
Data Collection:
- We used BeautifulSoup and requests to scrape Yahoo Finance for news articles related to specific stock tickers.
- For each article, we extracted the headline, timestamp, and content, focusing on the latest sentiment around these stocks.
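The extraction step described above might look roughly like the sketch below. It parses a static HTML snippet instead of a live page, and the tag and class names are placeholders, not Yahoo Finance's actual markup. In a real run the HTML would come from a requests call.

```python
from datetime import datetime

from bs4 import BeautifulSoup

# Hypothetical markup: these tags and class names are placeholders and do not
# reflect the real structure of any Yahoo Finance page.
SAMPLE_HTML = """
<div class="article">
  <h3 class="headline">Example Corp shares rise on earnings</h3>
  <time datetime="2024-01-15T09:30:00">Jan 15, 2024</time>
  <p class="body">Example Corp reported quarterly results on Monday.</p>
</div>
<div class="article">
  <h3 class="headline">Example Corp faces supply delays</h3>
  <time datetime="2024-01-16T14:05:00">Jan 16, 2024</time>
  <p class="body">Suppliers warned of shipping delays this quarter.</p>
</div>
"""


def parse_articles(html: str) -> list[dict]:
    """Extract headline, timestamp, and content from each article block."""
    soup = BeautifulSoup(html, "html.parser")
    articles = []
    for node in soup.find_all("div", class_="article"):
        headline = node.find("h3", class_="headline")
        stamp = node.find("time")
        body = node.find("p", class_="body")
        # Skip blocks that are missing any of the three fields.
        if headline is None or body is None:
            continue
        if stamp is None or stamp.get("datetime") is None:
            continue
        articles.append({
            "headline": headline.get_text(strip=True),
            "timestamp": datetime.fromisoformat(stamp["datetime"]),
            "content": body.get_text(strip=True),
        })
    return articles


articles = parse_articles(SAMPLE_HTML)
for article in articles:
    print(article["timestamp"].date(), article["headline"])
```

Keeping the parsing in its own function means it can be tested against saved HTML without hitting the network.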
Data Processing:
- Using pandas, we merged stock data with sentiment scores for day-by-day alignment.
- Technical indicators (e.g., EMA, SMA) were calculated to highlight stock trends over different periods.
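As an illustration of the merge and indicator steps, the sketch below joins daily prices with daily sentiment scores and derives SMA and EMA ratios. The column names, the 3-day window, and the sample values are assumptions chosen to keep the example small. The write-up does not state the windows actually used.

```python
import pandas as pd

# Hypothetical daily closes and daily average sentiment for one ticker.
prices = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04",
                            "2024-01-05", "2024-01-08"]),
    "ticker": "AAPL",
    "close": [10.0, 11.0, 12.0, 13.0, 14.0],
})
sentiment = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-03", "2024-01-05"]),
    "ticker": "AAPL",
    "sentiment": [0.5, -0.2],
})

# A left join keeps every trading day; days without news get a neutral 0.0.
merged = prices.merge(sentiment, on=["date", "ticker"], how="left")
merged["sentiment"] = merged["sentiment"].fillna(0.0)

# Compute indicators per ticker so windows never span two different stocks.
merged = merged.sort_values(["ticker", "date"]).reset_index(drop=True)
closes = merged.groupby("ticker")["close"]
merged["sma_3"] = closes.transform(lambda s: s.rolling(window=3).mean())
merged["ema_3"] = closes.transform(lambda s: s.ewm(span=3, adjust=False).mean())

# Ratios above 1.0 mean the close sits above its moving average.
merged["sma_ratio"] = merged["close"] / merged["sma_3"]
merged["ema_ratio"] = merged["close"] / merged["ema_3"]

print(merged[["date", "close", "sentiment", "sma_ratio", "ema_ratio"]])
```

The first two rows have no SMA because a 3-day window needs three observations. Those rows would normally be dropped before modeling.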
Modeling & Prediction:
- We used logistic regression, decision trees, random forests, and XGBoost to predict stock returns based on combined sentiment and technical data.
- Each model was evaluated on metrics like accuracy and ROC-AUC score to assess prediction effectiveness.
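A minimal sketch of the train-and-evaluate loop follows, using logistic regression as the representative model. The data is synthetic and the feature names are assumptions, so the scores say nothing about the project's real results. The split is chronological, since shuffling time-series rows would leak future information into training.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(seed=0)
n_days = 400

# Synthetic stand-ins for the real features (daily sentiment, EMA ratio).
sentiment = rng.normal(size=n_days)
ema_ratio = rng.normal(loc=1.0, scale=0.05, size=n_days)
X = np.column_stack([sentiment, ema_ratio])

# Synthetic label: 1 means the stock moved up. It is constructed so that
# sentiment carries some signal; this is an assumption, not a finding.
noise = rng.normal(scale=1.0, size=n_days)
y = (0.8 * sentiment + noise > 0).astype(int)

# Chronological split: train on the first 75% of days, test on the rest.
split = int(n_days * 0.75)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = LogisticRegression()
model.fit(X_train, y_train)

predicted = model.predict(X_test)
probabilities = model.predict_proba(X_test)[:, 1]  # P(class == 1)

accuracy = accuracy_score(y_test, predicted)
roc_auc = roc_auc_score(y_test, probabilities)
print(f"accuracy={accuracy:.3f} roc_auc={roc_auc:.3f}")
```

Swapping in a decision tree, random forest, or XGBoost classifier would only change the line that constructs the model, which makes a side-by-side comparison straightforward.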
Backtesting:
- With the bt (Backtrader) library, we backtested different strategies to measure returns and risk metrics over historical data.
- We calculated each model’s total returns, Sharpe ratios, and drawdowns, selecting the best-performing strategy for each stock ticker.
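The three metrics named here can be sketched in plain Python from a series of daily strategy returns. This does not use the backtesting library's API. The strategy names, the sample returns, the 252-day annualization factor, and the choice of Sharpe ratio as the selection criterion are all assumptions.

```python
import math
import statistics

TRADING_DAYS_PER_YEAR = 252  # common convention, assumed here


def total_return(daily_returns):
    """Compound the daily returns into one overall return."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    """Annualized mean excess return divided by its standard deviation.

    Needs at least two observations (sample standard deviation).
    """
    excess = [r - risk_free_daily for r in daily_returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        return float("nan")
    return statistics.mean(excess) / spread * math.sqrt(TRADING_DAYS_PER_YEAR)


def max_drawdown(daily_returns):
    """Largest peak-to-trough decline of the equity curve (0 or negative)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


# Hypothetical daily returns for two strategies on one ticker.
strategies = {
    "sentiment_only": [0.01, -0.02, 0.015, 0.005],
    "sentiment_plus_ema": [0.02, -0.01, 0.01, 0.01],
}
report = {
    name: {
        "total_return": total_return(returns),
        "sharpe": sharpe_ratio(returns),
        "max_drawdown": max_drawdown(returns),
    }
    for name, returns in strategies.items()
}

# Pick the strategy with the highest Sharpe ratio for this ticker.
best = max(report, key=lambda name: report[name]["sharpe"])
print(best, report[best])
```

Ranking by Sharpe ratio rewards steady returns over volatile ones. Ranking by total return alone could favor a strategy with a much deeper drawdown.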
Challenges We Faced
Data Volume & Processing:
- Managing and merging large datasets was complex, especially aligning dates and ticker symbols accurately.
Accurate Ticker Recognition:
- Ensuring the model filtered relevant tickers from web-scraped data was essential for meaningful sentiment analysis.
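One simple way to approach ticker filtering is to match uppercase tokens against a known watchlist, as sketched below. The watchlist, the pattern, and the function name are hypothetical. The write-up does not describe the filtering logic that was actually used.

```python
import re

# Hypothetical watchlist of tickers the analysis cares about.
WATCHLIST = {"AAPL", "MSFT", "TSLA"}

# Candidate tokens: standalone runs of 1 to 5 uppercase letters.
TOKEN_PATTERN = re.compile(r"\b[A-Z]{1,5}\b")


def extract_tickers(text, watchlist=WATCHLIST):
    """Return watchlist tickers mentioned in the text, in order of first use."""
    found = []
    for token in TOKEN_PATTERN.findall(text):
        # Uppercase words such as CEO or USA match the pattern but are
        # rejected here because they are not on the watchlist.
        if token in watchlist and token not in found:
            found.append(token)
    return found


print(extract_tickers("CEO says AAPL and MSFT will beat; AAPL up 2%"))
print(extract_tickers("The USA economy grew last quarter"))
```

This approach still confuses tickers that double as ordinary words (for example a ticker spelled like "ALL" or "IT"), so a production filter would need extra context, such as requiring a cashtag or a company-name match.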
Model Optimization:
- Fine-tuning model parameters to maximize accuracy was challenging and required several rounds of testing.
Technical Issues:
- Handling large files and ensuring seamless collaboration through Git and git-lfs required careful coordination.
Accomplishments We’re Proud Of
- Integrated Sentiment Analysis with Technical Analysis: We successfully combined these two fields, creating a model that provides a more holistic view of factors influencing stock prices.
- Optimized Strategies: After testing various strategies, we’re proud to have developed a model that can recommend the most effective approach for each ticker.
- Robust Data Pipeline: The data pipeline efficiently processes the latest scraped news and aligns it with stock data to provide actionable insights.
What We Learned
- Advanced Sentiment Analysis: We learned about using sentiment analysis tools in a financial context, applying this knowledge to stock data.
- Technical Skills with Python Libraries: Using libraries like BeautifulSoup, sklearn, and xgboost taught us a lot about data science best practices.
- Strategic Financial Insights: Backtesting revealed valuable insights into how market sentiment can drive stock returns.
What’s Next for Stock Market Sentiment Analysis
- Real-Time Sentiment Updates: We plan to integrate a real-time sentiment pipeline for instant insights on breaking news and shifts in sentiment.
- Expansion to Other Markets: Expanding beyond US stocks, we aim to apply this model to global markets, possibly including social media sentiment.
- Integration with Live Trading Platforms: Our goal is to evolve this project into a tool that provides sentiment-informed signals for real-time trading.
Built With
- backtrader
- beautiful-soup
- data-analysis
- git
- machine-learning
- model-generation
- pandas
- python
- sklearn
- yahoo-finance-api
