Inspiration

It's a common sight on metro services across the world - crowded trains, the crush on the platform and the slow trudge through the station to get where you're going. On the London Underground, for example, demand is projected to increase while the network continues to run on Victorian infrastructure.

As part of HackTrain 4.0, we were given access to WiFi data from the Toronto subway network, which anonymously tracks individual devices as they pass through various parts of a station. We know that people move more slowly when a station is busy, and with this data we could see exactly how long it took people to interchange between metro lines. That got us thinking - could we use that data to understand crowding in real time, and pass that information on to customers, operators and even third parties?

We thought we'd find out!

What it does

Devised over a 48-hour hackathon with plenty of idea generation and brainstorming, Platform currently does something simple, but crucial. By analysing over 30 million data points, we have been able to generate the average interchange time in 10-minute intervals across an 8-hour period at St. George station in Toronto. We've also been able to look at other key metrics, such as the rate of change of people entering the station (a derivative measure) and the cumulative counts.
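
As a rough illustration of the metrics described above, here is a minimal sketch of binning interchange times into 10-minute intervals. It is not our production code: the record shape (`start` in epoch milliseconds, `seconds` for the interchange duration) and the function name `summarise` are assumptions made for the example, and the "derivative" is approximated as the change in count from one bin to the next.

```javascript
// Hypothetical input: one record per completed interchange.
// `start` is epoch milliseconds, `seconds` is the interchange duration.
const BIN_MS = 10 * 60 * 1000; // 10-minute intervals

function summarise(interchanges) {
  // Group records by the start of the 10-minute bin they fall into.
  const bins = new Map();
  for (const { start, seconds } of interchanges) {
    const key = Math.floor(start / BIN_MS) * BIN_MS;
    const bin = bins.get(key) || { count: 0, total: 0 };
    bin.count += 1;
    bin.total += seconds;
    bins.set(key, bin);
  }

  let cumulative = 0;
  let previousCount = null;
  // Bins with no records are skipped, so `change` compares
  // consecutive non-empty bins.
  return [...bins.keys()].sort((a, b) => a - b).map((key) => {
    const { count, total } = bins.get(key);
    cumulative += count;
    const row = {
      binStart: key,
      averageSeconds: total / count,
      count,
      // Discrete stand-in for the derivative; 0 for the first bin.
      change: previousCount === null ? 0 : count - previousCount,
      cumulative,
    };
    previousCount = count;
    return row;
  });
}
```

Calling `summarise` on a list of interchange records returns one row per non-empty interval, which is the kind of series a dashboard chart could plot directly.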

This gives us a proof of concept - we're able to crunch the numbers to get a baseline that could be used for a huge variety of applications, including operational performance monitoring around congestion (something that is currently very difficult to quantify), potential early warning systems for stations about to be hit with congestion, and even a notification system for customers to advise them if they need to change their journey.

We have plenty of ideas as to how Platform could be developed further - more on that in the final section!

How we built it

We used the WiFi dataset provided by BAI Communications. This data covered the period between 8th and 12th November 2017.

The data was split into second-by-second intervals, and included:

  • Hashed unique ID
  • Timestamp
  • Station
  • Station floor

Methodology:

  • Used Kibana to understand the data and choose the samples to work with;
  • Extracted a small sample (10 minutes' worth of data) from a weekday;
  • Produced an algorithm in JavaScript that:
    • Identifies unique IDs (individuals) at St. George station who have made an interchange, and those who have continued on their journey;
    • Identifies the direction of interchange (platform to platform);
    • Calculates the dwell time, measured from when a device is first "seen" on one platform to when it has "disappeared" from the other platform;
    • Filters out anomalies and outliers in dwell time;
    • Takes the average dwell time.
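
The steps above can be sketched roughly as follows. This is a simplified illustration rather than the code we ran at the hackathon: the field names (`id`, `time`, `platform`) are hypothetical, the platform is assumed to be derivable from the station floor, only the first and last sighting of each device are compared, and the interquartile range rule used to drop outliers is an assumption, as the exact filter is not described here.

```javascript
// Hypothetical record shape: { id, time, platform }, with `time` in epoch seconds.
function interchangeTimes(records) {
  // Group sightings by hashed device ID.
  const byDevice = new Map();
  for (const r of records) {
    if (!byDevice.has(r.id)) byDevice.set(r.id, []);
    byDevice.get(r.id).push(r);
  }

  const results = [];
  for (const [id, sightings] of byDevice) {
    sightings.sort((a, b) => a.time - b.time);
    const first = sightings[0];
    const last = sightings[sightings.length - 1];
    // Simplification: a device first and last seen on the same platform
    // is treated as continuing its journey, not interchanging.
    if (first.platform === last.platform) continue;
    results.push({
      id,
      direction: `${first.platform}->${last.platform}`,
      seconds: last.time - first.time, // first "seen" to last sighting
    });
  }
  return results;
}

// Linear-interpolated quantile of an already sorted array.
function quantile(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Average after dropping values outside 1.5 x IQR (an assumed outlier rule).
function averageWithoutOutliers(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const spread = 1.5 * (q3 - q1);
  const kept = sorted.filter((v) => v >= q1 - spread && v <= q3 + spread);
  return kept.reduce((sum, v) => sum + v, 0) / kept.length;
}
```

In practice the results would be grouped by `direction` before averaging, so that each platform-to-platform flow gets its own figure.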

The data is displayed on an interactive dashboard with additional tools such as statistical analysis and triggers.
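
To give a feel for what a dashboard trigger might look like, here is a minimal sketch. How the triggers are actually configured is not covered in this write-up, so the rule below is an assumption: it flags an interval as congested when its average interchange time is more than `k` standard deviations above a baseline. The function name `congestionTrigger` and the default `k = 2` are hypothetical.

```javascript
// Builds a check from baseline interchange times (in seconds).
// Assumed rule: congested if latest average > mean + k * standard deviation.
function congestionTrigger(baselineSeconds, k = 2) {
  const n = baselineSeconds.length;
  const mean = baselineSeconds.reduce((sum, v) => sum + v, 0) / n;
  const variance =
    baselineSeconds.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  const threshold = mean + k * Math.sqrt(variance);
  // Returns a function the dashboard could call for each new interval.
  return (latestSeconds) => latestSeconds > threshold;
}

// Usage with made-up baseline values:
const isCongested = congestionTrigger([90, 100, 110]);
console.log(isCongested(115), isCongested(120));
```

A fixed statistical threshold like this is only a starting point; it would need tuning per station and per time of day before it could drive real alerts.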

Challenges we ran into

We had two main challenges in reaching this stage of the project. Initially, our challenge was deciding on what to do with the data - WiFi data is a relatively new product with a huge range of applications, including analysing customer demographics, targeted advertising, mapping routes around a network and of course understanding passenger flows in stations. It took us a while to decide on an idea that was not only interesting, but answered a key problem: how do we understand interchange flows?

Once we had agreed on an idea, we knew what we had to do - but contending with over 30 million data points for just one day was no mean feat!

Accomplishments that we're proud of

Reaching the point where we had a baseline that proved it could be done was a major achievement - this tells us that more can be done, and there is potential for the Platform product.

What we learned

Exploring the options for what WiFi data can do was eye-opening. Any one of the ideas we generated together could have been taken forward as a project, but ultimately we wanted to find one that would add value to the industry.

What's next for Platform

We have a few ideas for our longer-term pipeline! Some examples are:

  • Digital Out of Home advertising: Timestamp data and web search data could be used to target advertising to groups or even individuals.

  • Machine learning: Over time the system could learn which interchange times indicate serious congestion, so that automated alerts can be sent to station management and customers.

  • Improved station modelling: Current station modelling is based on assumptions, but WiFi data could be used to inform and cross-reference models to improve their accuracy.
