Martingale SPY NLP Trading Signals
An end-to-end machine learning system that transforms unstructured financial text (news, analyst reports, social media) into predictive trading signals for the S&P 500 (SPY), optimized for risk-adjusted returns.
Architecture
Phase 1: Feature Engineering
- Price Features: Daily returns, rolling volatility and simple moving averages (10-day and 20-day windows), momentum, trend indicators
- Text Features: TF-IDF vectorization, sentiment analysis (positive/negative word counts)
- Data Integration: Merged financial texts with daily OHLCV data
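The price and sentiment features listed above can be illustrated with a minimal, standard-library-only sketch. The word lists, function names, and window size here are hypothetical placeholders chosen for illustration; the project's actual lexicon and pandas-based pipeline are not shown on this page.

```python
import re
import statistics

# Hypothetical word lists for illustration only; the project's real lexicon is not given.
POSITIVE_WORDS = {"gain", "beat", "growth", "upgrade", "strong"}
NEGATIVE_WORDS = {"loss", "miss", "decline", "downgrade", "weak"}


def sentiment_counts(text):
    """Count positive and negative words and derive a simple net score in [-1, 1]."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE_WORDS for t in tokens)
    neg = sum(t in NEGATIVE_WORDS for t in tokens)
    total = pos + neg
    score = (pos - neg) / total if total else 0.0
    return {"pos": pos, "neg": neg, "score": score}


def daily_returns(closes):
    """Simple close-to-close returns; the output is one element shorter than the input."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]


def rolling_volatility(returns, window):
    """Trailing sample standard deviation; None until a full window is available."""
    out = []
    for i in range(len(returns)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(statistics.stdev(returns[i + 1 - window : i + 1]))
    return out


example = sentiment_counts("Strong earnings beat estimates despite a weak outlook")
```

In a pandas pipeline the same quantities would typically come from `pct_change()` and `rolling(window).std()`, with the text features joined to the OHLCV rows by date.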
Phase 2: Model Development
- Algorithm: LightGBM regression with time-series cross-validation (3-fold TimeSeriesSplit)
- Cross-Validation: Temporal CV respects data chronology; each fold trains only on observations that precede its validation window, which avoids look-ahead bias
- Metrics: RMSE ≈ 0.01, average Sharpe ratio ≈ 0.45-0.65 across folds
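To make the temporal cross-validation concrete, here is a small standard-library sketch of an expanding-window split together with an RMSE helper. The function names are hypothetical, and the split logic is written to mirror what scikit-learn's `TimeSeriesSplit` does with default settings, as far as I understand it; the notebook itself presumably calls `TimeSeriesSplit(n_splits=3)` directly and fits a LightGBM regressor on each fold.

```python
import math


def expanding_window_splits(n_samples, n_splits=3):
    """Yield (train_idx, test_idx) pairs where training data always precedes test data."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for start in range(first_test_start, n_samples, test_size):
        train_idx = list(range(0, start))  # everything before the test window
        test_idx = list(range(start, start + test_size))
        yield train_idx, test_idx


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


folds = list(expanding_window_splits(8, n_splits=3))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Because every training window ends strictly before its validation window begins, no fold is scored on data the model could have seen during fitting.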
Phase 3: Trading Logic & Risk Management
- Signal Generation: Threshold-based (±1% predicted return)
- Risk Overlay: Volatility scaling (reduce position size in high-volatility regimes)
- Sharpe Improvement: Risk overlay improved Sharpe ratio by 0.3-0.4 points
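The signal and risk-overlay steps above can be sketched as follows. The ±1% threshold comes from the description; the `target_vol` parameter and the specific scaling rule (cap exposure at 1 and shrink it in proportion to realized volatility) are assumptions made for illustration, since the exact overlay formula is not stated here.

```python
def threshold_signal(predicted_return, threshold=0.01):
    """Map a predicted return to +1 (buy), -1 (sell), or 0 (hold)."""
    if predicted_return > threshold:
        return 1
    if predicted_return < -threshold:
        return -1
    return 0


def volatility_scaled_position(signal, realized_vol, target_vol=0.01):
    """Shrink the position when realized volatility exceeds a target level.

    target_vol is a hypothetical parameter; exposure is never levered above 1.
    """
    if realized_vol <= 0:
        return 0.0
    scale = min(1.0, target_vol / realized_vol)
    return signal * scale


# Example: a buy signal in a regime where volatility is twice the target.
position = volatility_scaled_position(threshold_signal(0.015), realized_vol=0.02)
```

With these assumed settings, the buy signal is held at half size, while the same signal in a calm regime (realized volatility at or below the target) would keep full size.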
Key Results
- Combined text+price features outperform individual signals
- Volatility-aware position sizing reduces tail risk without sacrificing returns
- Model selection relied on time-series cross-validation scores rather than leaderboard feedback, which limits overfitting to the leaderboard
Links
- Kaggle Notebook: https://www.kaggle.com/code/paulotuppy/martingale-spy-text-signals
- GitHub Repository: https://github.com/PauloTuppy/martingale-hacks-spy-nlp
- Strategy Memo: Available in repository
Final Submission Results
Kaggle Notebook Status: ✓ Successfully Executed
- All 6 sections completed without errors
- Submission file created: submission.csv (1865 records)
Model Performance (Time-Series CV):
- Average RMSE: 0.0116
- Average Sharpe Ratio: 1.0567
- Win Rate: 52.87%
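For reference, the Sharpe ratio and win rate reported above can be computed along these lines. This is a sketch under stated assumptions: a zero risk-free rate, annualization with 252 trading days, and a win rate measured only over days with a non-zero strategy return. The notebook's exact definitions may differ.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def win_rate(returns):
    """Share of active periods (non-zero return) that ended with a gain."""
    active = [r for r in returns if r != 0]
    if not active:
        return 0.0
    return sum(r > 0 for r in active) / len(active)


strategy_returns = [0.01, -0.02, 0.0, 0.03]
print(f"Sharpe: {sharpe_ratio(strategy_returns):.2f}")
print(f"Win rate: {win_rate(strategy_returns):.2%}")
```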
Trading Signal Distribution:
- Buy Signals: 662
- Sell Signals: 707
- Hold Signals: 496
Submission Ready: Yes ✓
Built With
- financial-data-analysis
- github
- jupyter-notebook
- kaggle
- lightgbm
- machine-learning
- natural-language-processing
- numpy
- pandas
- python
- scikit-learn
- time-series-analysis