Inspiration
The constant struggle with unpredictable delays and packed commutes pushed us to build a tool that puts transit agencies back in control. We wanted to turn raw, disparate data into clear, actionable information that simplifies operations and improves rider satisfaction.
What it does
Choo Choo Vroom Vroom Broom Broom gathers data from several sources (transit records, traffic patterns, and weather) and applies machine learning to forecast delays. By detecting delay hotspots and predicting upcoming disruptions, it lets operators allocate resources in advance and keep passengers moving.
How we built it
We used Python with pandas and scikit-learn to preprocess and merge our datasets, then built XGBoost classification and regression models to predict both the probability of a delay and its duration. These predictions feed a visualization layer built with Folium and Plotly, which provides an easy-to-use dashboard for real-time decision-making.
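To make the two-model idea concrete, here is a minimal sketch of one way a classifier and a regressor could be combined. It is not our actual pipeline: the data is synthetic, the feature names (`hour`, `precip_mm`, `traffic_index`) and the 2-minute delay threshold are made up for illustration, and scikit-learn's gradient boosting estimators stand in for XGBoost (`xgboost.XGBClassifier` and `xgboost.XGBRegressor` follow the same fit/predict interface). Multiplying the delay probability by the predicted duration is likewise just one assumed way to get a single expected-delay figure.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; feature names and distributions are hypothetical.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "hour": rng.integers(5, 24, n),
    "precip_mm": rng.gamma(1.0, 2.0, n),
    "traffic_index": rng.uniform(0, 1, n),
})

# Made-up target: delay grows with traffic and rain, floored at zero minutes.
delay_min = np.clip(
    4 * X["traffic_index"] + 0.8 * X["precip_mm"] + rng.normal(0, 1, n) - 2,
    0, None,
)
is_delayed = (delay_min > 2).astype(int)  # hypothetical 2-minute threshold

X_train, X_test, d_train, d_test, y_train, y_test = train_test_split(
    X, delay_min, is_delayed, test_size=0.25, random_state=0
)

# Stage 1: probability that a trip is delayed at all.
clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Stage 2: delay duration, trained only on trips that were actually delayed.
delayed = y_train == 1
reg = GradientBoostingRegressor(random_state=0).fit(
    X_train[delayed], d_train[delayed]
)

# Combine both stages into one expected-delay estimate per trip.
p_delay = clf.predict_proba(X_test)[:, 1]
forecast = X_test.assign(
    p_delay=p_delay,
    expected_delay_min=p_delay * reg.predict(X_test),
)
print(forecast.head())
```

A table like `forecast` is the kind of output a dashboard layer could then colour by `p_delay` or `expected_delay_min`.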
Challenges we ran into
Merging data from multiple sources while keeping it consistent and accurate was a challenge. Missing values, disparate formats, and differing sampling frequencies added an extra layer of complexity. Choosing a set of features that preserved model accuracy without sacrificing speed was another problem to solve.
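As an illustration of the frequency mismatch, the sketch below aligns per-arrival transit records with hourly weather readings using `pandas.merge_asof`. The column names, sample values, and the 90-minute staleness tolerance are all assumptions made for this example, not details of our datasets.

```python
import pandas as pd

# Hypothetical sample data: column names and values are illustrative only.
arrivals = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-01 08:02", "2024-01-01 08:17",
        "2024-01-01 09:05", "2024-01-01 11:30",
    ]),
    "route": ["A", "A", "B", "B"],
    "delay_min": [3.0, 12.0, 0.5, 7.0],
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01 08:00", "2024-01-01 09:00"]),
    "precip_mm": [0.0, 2.5],
})


def attach_latest_weather(events, readings, tolerance="90min"):
    """Attach the most recent reading at or before each event.

    Events with no reading inside the tolerance window get NaN, so stale
    weather is surfaced as a missing value instead of being silently reused.
    """
    # merge_asof requires both frames to be sorted by the join key.
    events = events.sort_values("timestamp")
    readings = readings.sort_values("timestamp")
    return pd.merge_asof(
        events, readings, on="timestamp",
        direction="backward", tolerance=pd.Timedelta(tolerance),
    )


merged = attach_latest_weather(arrivals, weather)
print(merged)
```

Here the 11:30 arrival is more than 90 minutes past the last weather reading, so its `precip_mm` is left as NaN and has to be handled explicitly (imputed, dropped, or flagged) before modelling.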
Accomplishments that we're proud of
We built an end-to-end pipeline that is both robust and interpretable. Our ML models performed consistently on real-world data, and our interactive dashboards let stakeholders spot problems before they spiral out of control, rather than scrambling to fix them afterward.
What we learned
We learned how much data hygiene and feature engineering matter in machine learning, and how important it is to design interactive, user-friendly visualizations. We also learned to balance model complexity against real-time processing so that insights arrive in time to be acted on.
What's next for Choo Choo Vroom Vroom Broom Broom
We plan to incorporate more external data, such as real-time traffic feeds and passenger load reports, to further improve our predictions. Other goals include extending to multi-modal transit networks, producing more detailed disruption forecasts, and developing predictive maintenance recommendations to reduce downtime. Eventually, we want to be the hub for smarter, data-driven city mobility.
Built With
- folium
- geopy
- matplotlib
- numpy
- pandas
- python
- scikit-learn
- shap
- xgboost