Inspiration
We were inspired by the NJ Transit App's arrival prediction feature, which alerts users when a train is within X minutes of their station and shows the train's expected arrival time in the app.
What it does
Our project enhances that feature by introducing AI to fine-tune the prediction algorithm. Our understanding of the current algorithm is that train location is the only variable taken into account, so we decided to account for several additional variables in order to improve the accuracy of arrival predictions.
How we built it
To accomplish this, we trained a machine learning model that predicts train arrival time from historical arrival data. We chose CatBoost, a gradient boosting library with built-in handling of categorical features, because it has recently grown in popularity and because we expected it to perform best on our easily categorized data. The variables the model considers are the month, day, hour, and minute of arrival, the beginning and end stations, the rail line, and precipitation during the trip.
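As a rough illustration of the variables listed above, one training row could be assembled as in the sketch below. This is not our actual pipeline: the `ArrivalFeatures` class, the `build_features` helper, the field names, the millimetre unit for precipitation, and the sample station and line values are all assumptions made for the example.

```python
from dataclasses import dataclass, astuple
from datetime import datetime


@dataclass
class ArrivalFeatures:
    """One row of model input (hypothetical layout, for illustration only)."""
    month: int
    day: int
    hour: int
    minute: int
    from_station: str
    to_station: str
    line: str
    precipitation_mm: float  # unit is an assumption


# Columns holding categories rather than numbers (hypothetical names).
CATEGORICAL_COLUMNS = ["from_station", "to_station", "line"]


def build_features(arrival_time, from_station, to_station, line, precipitation_mm):
    """Split the arrival timestamp into its parts and attach the trip details."""
    return ArrivalFeatures(
        month=arrival_time.month,
        day=arrival_time.day,
        hour=arrival_time.hour,
        minute=arrival_time.minute,
        from_station=from_station,
        to_station=to_station,
        line=line,
        precipitation_mm=precipitation_mm,
    )


# Illustrative values only, not taken from our dataset.
row = build_features(
    datetime(2020, 3, 14, 8, 27),
    "Newark Penn Station",
    "New York Penn Station",
    "Northeast Corridor",
    1.5,
)
print(astuple(row))
```

Rows shaped like this can be handed to a CatBoost regressor, with the string columns named in its `cat_features` argument so that they are treated as categories instead of numbers.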
Challenges we ran into
One of the primary challenges we faced was finding enough contributing variables to predict train delay accurately without adding so many that we could not gather the required data in a timely manner. We overcame this challenge by incorporating as much of the provided data as possible and limiting the external data sources to precipitation and departure data.
Accomplishments that we're proud of
We're proud to have gotten our root mean square error down to 4.6 minutes, and with further training, better hardware, and the introduction of more contributing variables to train on, we are confident that this figure can be lowered even further.
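For readers unfamiliar with the metric, root mean square error is the square root of the average squared difference between predicted and actual values, so it is expressed in the same unit as the predictions (minutes here). The sketch below shows the calculation on made-up numbers; the `rmse` helper and the sample lists are illustrative and are not our evaluation data.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values in minutes, for illustration only.
predicted_minutes = [2.0, 5.0, 0.0, 10.0]
actual_minutes = [4.0, 5.0, 3.0, 6.0]

score = rmse(predicted_minutes, actual_minutes)
print(f"RMSE: {score:.2f} minutes")
```

Because the errors are squared before averaging, a few badly missed predictions raise the score more than many small misses do.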
What we learned
By comparing several machine learning algorithms with each other, we learned that CatBoost worked best for our dataset. The comparison also gave us the opportunity to learn more about popular machine learning algorithms and the kinds of data that work best with each.
What's next for Arrival Prediction Correction System (APCS)
Our next goal is integration into the NJ Transit App, where our enhancement would feed into the arrival prediction and give users a more accurate estimate of when their train is coming. We hope that this will give regular NJ Transit riders an improved traveling and commuting experience and get them where they need to go in a convenient and timely manner. We also plan to keep training our model to improve the accuracy of its predictions.