Inspiration
Public transit is a vital part of urban life, connecting people to work, school, and leisure activities. However, delays can cause significant disruptions, leading to frustration and lost productivity. As frequent users of the TTC, we were inspired to develop a predictive model that helps commuters make informed travel decisions. By forecasting subway delays, we aim to improve the rider experience and contribute to more efficient urban mobility.
What it does
Our project consists of three main components:
- Exploratory Data Analysis (EDA): We analyzed TTC subway delay data, visualizing trends and identifying key factors contributing to delays.
- Machine Learning Model: We trained a random forest classifier to predict delays based on user input (starting station, destination, departure time, etc.), returning both a predicted outcome and its estimated probability.
- User-Friendly Web Application: We built a website where users can enter their travel details and immediately receive a delay prediction.
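
As a rough illustration of the second component, the sketch below trains a random forest on synthetic data and returns both a prediction and a probability for one trip. Everything in it is an assumption made for the example: the feature names (`station`, `hour`, `weekday`), the made-up rush-hour rule, and the hyperparameters are not taken from our actual dataset or model.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for historical delay records (not real TTC data).
rng = np.random.default_rng(0)
n = 400
stations = ["Union", "Bloor-Yonge", "Kennedy", "Finch"]
trips = pd.DataFrame({
    "station": rng.choice(stations, size=n),
    "hour": rng.integers(5, 24, size=n),
    "weekday": rng.integers(0, 7, size=n),
})

# Made-up rule so the model has a pattern to learn:
# trips during rush hours are labelled as delayed more often.
rush = trips["hour"].isin([7, 8, 9, 16, 17, 18])
delayed = (rng.random(n) < np.where(rush, 0.7, 0.2)).astype(int)

# One-hot encode the station name, pass numeric columns through unchanged,
# then fit the forest on the encoded features.
model = Pipeline([
    ("encode", ColumnTransformer(
        [("station", OneHotEncoder(handle_unknown="ignore"), ["station"])],
        remainder="passthrough",
    )),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
])
model.fit(trips, delayed)

# A single hypothetical user query: Union station, 8 a.m., Wednesday.
query = pd.DataFrame([{"station": "Union", "hour": 8, "weekday": 2}])
prediction = int(model.predict(query)[0])                # 0 = on time, 1 = delayed
probability = float(model.predict_proba(query)[0, 1])    # estimated P(delayed)
print(prediction, round(probability, 2))
```

`predict` gives the hard yes/no answer, while `predict_proba` exposes the share of trees voting for each class, which is what lets the app show a probability alongside the prediction.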
How we built it
- Data Processing & Analysis: We used Pandas to clean and analyze the dataset, extracting meaningful insights.
- Machine Learning Model: We implemented a random forest model using scikit-learn. We trained and tested the model on historical TTC delay data, achieving an accuracy of 73%.
- Web Development: We developed a frontend using Next.js and Tailwind CSS and connected it to our backend model using a Flask API.
- Deployment: The web application is hosted on Vercel, making it easily accessible for users.
Challenges we ran into
- Data Quality Issues: The dataset contained missing and inconsistent values, requiring extensive preprocessing.
- Feature Selection: Identifying the most relevant factors influencing delays was challenging, as subway delays depend on many external factors.
- Model Optimization: Balancing model complexity and accuracy while avoiding overfitting took several iterations.
- Web Integration: Connecting our machine learning model to a functional web interface required careful API design and deployment strategies.
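
The kind of preprocessing the data quality issue called for can be sketched with pandas as follows. The column names and sample rows are invented for illustration and do not reproduce our dataset; the point is the pattern of normalizing text, coercing unparseable values to missing, and then dropping incomplete rows.

```python
import pandas as pd

# Invented rows showing typical problems: stray whitespace, inconsistent
# capitalization, missing stations, unparseable times, empty delay values.
raw = pd.DataFrame({
    "Station": [" union ", "KENNEDY", None, "Finch", "finch"],
    "Time": ["08:15", "17:40", "09:05", "not recorded", "22:30"],
    "Min Delay": ["5", "", "12", "3", "7"],
})

clean = raw.copy()

# Normalize station names so "finch" and "Finch" count as the same station.
clean["Station"] = clean["Station"].str.strip().str.title()

# errors="coerce" turns values that cannot be parsed into NaT/NaN
# instead of raising, so bad rows can be filtered out afterwards.
clean["Hour"] = pd.to_datetime(
    clean["Time"], format="%H:%M", errors="coerce"
).dt.hour
clean["Min Delay"] = pd.to_numeric(clean["Min Delay"], errors="coerce")

# Keep only rows where every field needed by the model is present.
clean = clean.dropna(subset=["Station", "Hour", "Min Delay"]).reset_index(drop=True)
clean["Hour"] = clean["Hour"].astype(int)

print(clean[["Station", "Hour", "Min Delay"]])
```

Dropping rows is the simplest option; imputing missing values instead would keep more data but adds assumptions of its own.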
Accomplishments that we're proud of
- Building a working delay prediction model with 73% accuracy.
- Developing an interactive web application that allows commuters to check delays in real time.
- Creating clear and insightful visualizations that reveal delay patterns across the TTC network.
- Overcoming technical challenges as a beginner team in data science and machine learning.
What we learned
- The importance of data cleaning and preprocessing in machine learning.
- How to implement and tune random forests for classification tasks.
- How to integrate a machine learning model into a web application using Flask.
- Best practices in team collaboration, version control, and API development.
What's next for TTC Subway Delay Prediction
- Enhancing Model Accuracy: Experimenting with deep learning techniques or other ensemble methods.
- Expanding Data Sources: Incorporating weather data, traffic conditions, and real-time TTC updates for improved predictions.
- Mobile App Integration: Developing a mobile-friendly version for easier accessibility.
- Real-Time Updates: Implementing a live-feed system to continuously improve delay predictions.
- Collaboration with TTC: Exploring opportunities to collaborate with the TTC for official integration into transit planning tools.