Project Story — GR Cup Race Intelligence
About the project
This project is a real-time analytics and strategy engine for the GR Cup series. It processes split timing and telemetry data across 7 tracks (14 races total), computes lap times, consistency scores, and pit stop recommendations, and exposes both an interactive Streamlit dashboard (dashboard.py) and self-contained HTML reports (dashboard_report.html, race_report.html). The consolidated export is all_tracks_results.json.
Inspiration
I built this project because I wanted to explore how lightweight data engineering, time-series processing, and simple physics-based modeling can deliver actionable race strategy. Racing provides a compact, high-stakes environment where small insights (e.g., a lap time delta of 0.5s) can meaningfully change decisions. That combination of measurable performance signals, rich telemetry, and real-world impact inspired the project.
What I learned
- Practical handling of split timing: merging lap starts and lap ends, aligning by vehicle and lap number.
- Telemetry pivoting: converting long telemetry logs (1M+ rows) into per-lap wide-format summaries that are fast to analyze.
- Data validation matters: a filtering bug let 0.001s outliers be reported as best laps; requiring a realistic minimum lap time and validating in several stages prevents this.
- Building user-friendly deliverables: a Streamlit app for interactive exploration and static HTML reports for judges/stakeholders.
- Tooling and reproducibility: packaging the full pipeline (process_all_tracks.py → all_tracks_results.json → generate_dashboard.py) and documenting it clearly.
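The telemetry pivoting mentioned in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (vehicle_id, lap, signal, value), the function name pivot_telemetry, and the choice of mean/max aggregates are all assumptions.

```python
import pandas as pd


def pivot_telemetry(long_df: pd.DataFrame) -> pd.DataFrame:
    """Collapse long-format telemetry into one row per (vehicle_id, lap).

    Assumed (hypothetical) input columns: vehicle_id, lap, signal, value.
    """
    # Aggregate first, so the pivot only sees one row per vehicle/lap/signal.
    per_signal = (
        long_df.groupby(["vehicle_id", "lap", "signal"])["value"]
        .agg(["mean", "max"])
        .reset_index()
    )
    wide = per_signal.pivot(
        index=["vehicle_id", "lap"], columns="signal", values=["mean", "max"]
    )
    # Flatten the (stat, signal) column pairs into names like "speed_mean".
    wide.columns = [f"{signal}_{stat}" for stat, signal in wide.columns]
    return wide.reset_index()


# Tiny made-up sample: one car, two laps, two signals.
sample = pd.DataFrame(
    {
        "vehicle_id": ["A", "A", "A", "A", "A", "A"],
        "lap": [1, 1, 1, 1, 2, 2],
        "signal": ["speed", "speed", "throttle", "throttle", "speed", "throttle"],
        "value": [100.0, 120.0, 0.5, 0.9, 130.0, 1.0],
    }
)
per_lap = pivot_telemetry(sample)
```

Aggregating before pivoting keeps the intermediate frame small, which is one way to stay memory-aware on multi-million-row logs.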
How I built it
High-level pipeline:
- Load raw CSVs (lap starts, lap ends, telemetry, results, weather).
- Merge split timing by vehicle_id and lap.
- Compute lap durations and filter realistic values.
- Pivot telemetry to per-lap features (speed, throttle, brake, gear).
- Compute analytics: best lap, average lap, standard deviation, consistency score.
- Predict pit windows using a simple fuel/tire model.
- Produce outputs: all_tracks_results.json, race_report.html, dashboard_report.html, and the Streamlit app.
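The merge and lap-duration steps of the pipeline could look something like the sketch below. The column names (vehicle_id, lap, timestamp) and the function name compute_lap_times are assumptions for illustration; the 30-second floor is the realistic minimum described in this write-up.

```python
import pandas as pd

MIN_LAP_SECONDS = 30.0  # realistic minimum lap time used as the filter floor


def compute_lap_times(starts: pd.DataFrame, ends: pd.DataFrame) -> pd.DataFrame:
    """Join lap starts and ends, then keep only plausible lap durations.

    Assumed (hypothetical) columns in both frames: vehicle_id, lap, timestamp.
    """
    merged = starts.merge(
        ends, on=["vehicle_id", "lap"], suffixes=("_start", "_end"), how="inner"
    )
    merged["lap_time_s"] = (
        pd.to_datetime(merged["timestamp_end"])
        - pd.to_datetime(merged["timestamp_start"])
    ).dt.total_seconds()
    # Drop invalid splits such as 0.001 s laps.
    return merged[merged["lap_time_s"] > MIN_LAP_SECONDS].reset_index(drop=True)


# Made-up sample: car A lap 2 is a bogus 0.001 s split and should be dropped.
starts = pd.DataFrame(
    {
        "vehicle_id": ["A", "A", "B"],
        "lap": [1, 2, 1],
        "timestamp": [
            "2025-01-01 12:00:00.000",
            "2025-01-01 12:01:35.000",
            "2025-01-01 12:00:00.000",
        ],
    }
)
ends = pd.DataFrame(
    {
        "vehicle_id": ["A", "A", "B"],
        "lap": [1, 2, 1],
        "timestamp": [
            "2025-01-01 12:01:35.000",
            "2025-01-01 12:01:35.001",
            "2025-01-01 12:01:40.000",
        ],
    }
)
laps = compute_lap_times(starts, ends)
```

An inner join silently discards laps that have a start but no end (or the reverse), which is usually the desired behavior for incomplete splits.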
Key code files:
- Stragery_engine.py: core engine (data loading, lap-time calculation, telemetry pivot, insight generation).
- process_all_tracks.py: orchestrator that runs every track/race, consolidates results, and performs final verification.
- generate_dashboard.py: generates dashboard_report.html from the consolidated JSON and validates data before output.
- dashboard.py: interactive Streamlit UI for deeper exploration.
Mathematically:
- Lap time for one lap is computed as the difference between end and start timestamps:
\[ \Delta t = t_{end} - t_{start} \]
- Consistency score (0–100) used in the system is:
\[ \text{ConsistencyScore} = 100 - \left(\frac{\sigma_{lap}}{\mu_{lap}} \times 100\right) \]
where \(\sigma_{lap}\) is the standard deviation of lap times and \(\mu_{lap}\) is the mean lap time.
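As a quick worked example of the consistency score, here is a small standard-library sketch. Two details are assumptions rather than something the write-up specifies: it uses the sample standard deviation (the pandas default), and it clamps the result to the stated 0–100 range.

```python
import statistics


def consistency_score(lap_times: list[float]) -> float:
    """100 minus the coefficient of variation (in percent) of the lap times.

    Needs at least two laps; statistics.stdev raises StatisticsError otherwise.
    """
    mean = statistics.fmean(lap_times)
    stdev = statistics.stdev(lap_times)  # sample std dev (assumption)
    score = 100.0 - (stdev / mean) * 100.0
    # Clamp to the documented 0-100 range (assumption; the formula itself is unbounded below).
    return max(0.0, min(100.0, score))


# Laps of 90, 92 and 94 s: mean 92, sample std dev 2, so the score is 100 - 200/92.
score = consistency_score([90.0, 92.0, 94.0])
```

A driver whose laps vary by about 2% of the mean therefore scores just under 98.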
- Simple fuel/tire model used to propose a pit window (illustrative):
\[ \text{FuelRemaining} = 100 - (\text{current\_lap} \times r_{fuel}) \]
\[ \text{LapsUntilRefuel} = \frac{\text{FuelRemaining}}{r_{fuel}} \]
The engine uses a small adjustment to propose an optimal pit lap (e.g., \(\pm 2\) laps around the estimate) taking into account urgency and predicted performance drop.
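The fuel side of this model can be sketched in a few lines. This is only an illustration under stated assumptions: the function name pit_window, the way the estimated pit lap is derived (current lap plus whole laps of fuel left), and the fixed margin are hypothetical, and the urgency and performance-drop adjustments are left out.

```python
import math


def pit_window(current_lap: int, fuel_rate: float, margin: int = 2) -> dict:
    """Estimate a pit window from a linear fuel-burn model.

    fuel_rate is the percentage of a full tank burned per lap (r_fuel).
    """
    if fuel_rate <= 0:
        raise ValueError("fuel_rate must be positive")
    fuel_remaining = max(0.0, 100.0 - current_lap * fuel_rate)
    laps_until_refuel = fuel_remaining / fuel_rate
    # Assumption: the estimated pit lap is the last full lap the fuel covers.
    estimate = current_lap + math.floor(laps_until_refuel)
    return {
        "fuel_remaining": fuel_remaining,
        "laps_until_refuel": laps_until_refuel,
        "estimate": estimate,
        # Window of +/- margin laps, never earlier than the current lap.
        "window": (max(current_lap, estimate - margin), estimate + margin),
    }


# Lap 10 at 4% fuel per lap: 60% left, 15 laps of range, estimate lap 25.
plan = pit_window(current_lap=10, fuel_rate=4.0)
```

With a purely linear burn rate the estimate always lands at 100 / r_fuel laps, so any real benefit comes from the adjustments layered on top.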
Challenges faced
- Data naming conventions: each track had slightly different CSV naming patterns. I implemented cascading filename glob patterns to locate files reliably (e.g., R{N}_{track}_lap_start.csv, {track}_lap_start_time_R{N}.csv, and a generic fallback *lap_start*R{N}*.csv).
- Outlier lap-time bug: an initial filter allowed values > 0, which picked up invalid splits (0.001s). Fix: require laps to be greater than a realistic minimum (30s) and add multi-stage validation.
- Scalability of telemetry pivot: converting long telemetry (millions of rows) into a compact wide per-lap summary required careful grouping and memory-aware pivoting.
- Cross-platform encoding issues: Windows default console encoding required ensuring outputs were encoded in UTF-8 to avoid crashes when printing non-ASCII characters.
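The cascading filename lookup from the first challenge can be sketched with pathlib. The three patterns are the ones listed above; the function name find_lap_start_file, its parameters, and the tie-breaking rule (first match in sorted order) are assumptions.

```python
from pathlib import Path
from typing import Optional


def find_lap_start_file(data_dir: Path, track: str, race: int) -> Optional[Path]:
    """Try the most specific filename pattern first, then fall back."""
    patterns = [
        f"R{race}_{track}_lap_start.csv",
        f"{track}_lap_start_time_R{race}.csv",
        f"*lap_start*R{race}*.csv",  # generic fallback
    ]
    for pattern in patterns:
        matches = sorted(data_dir.glob(pattern))
        if matches:
            # Assumption: if several files match, take the first in sorted order.
            return matches[0]
    return None
```

Returning None instead of raising lets the orchestrator decide whether a missing file should skip the race or abort the run.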
Built with
- Languages: Python 3.13
- Libraries: pandas, numpy, streamlit
- Standard-library modules: pathlib, datetime, json
- Output: Static HTML/CSS reports and interactive Streamlit app
Cloud / deployment: The project is self-contained and runs locally — it does not require cloud infrastructure for the submission deliverable. The static HTML reports are fully portable for judges.
Data & scale: ~4.7M telemetry records processed, 3,500+ lap times calculated, 14 races, 349 drivers, and 60 generated insights.
How to run (quick)
- Process everything:
python process_all_tracks.py
- Generate the HTML dashboard (validates before writing):
python generate_dashboard.py
- Optional: run the interactive Streamlit app:
pip install streamlit pandas
streamlit run dashboard.py
What I would build next
- Add a small scheduler to run incremental updates from live data feeds.
- Incorporate a lightweight ML model for lap-time prediction per driver/track using historical features.
- Add test coverage and CI (unit tests for parsing, validation, and metrics).
Acknowledgements
Thanks to the open-source Python ecosystem (pandas, Streamlit) and the GR Cup dataset used for this project.