We were riding the new west metro and realized that HSL has only a limited view of how many people use the metro and where they are going.
We thought that we could simply film people going up and down the escalators and tell HSL exactly at which stop each person got on and where they got off.
What it does
It counts the number of unique visitors seen across video streams and records when and where each individual was spotted. This lets us compute how many people took each route, how many were seen at any given station, and much more.
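As a rough illustration of the kind of aggregation this enables, here is a minimal Python sketch. The record layout `(person_id, station, timestamp)`, the sample data, and the `summarize` helper are all assumptions made for this example, not our actual data model. A "route" is simplified here to the first and last station at which a person was seen.

```python
from collections import Counter, defaultdict

# Hypothetical sighting records: (person_id, station, unix_timestamp).
SIGHTINGS = [
    ("p1", "Tapiola", 100),
    ("p2", "Tapiola", 110),
    ("p1", "Lauttasaari", 400),
    ("p3", "Lauttasaari", 420),
    ("p2", "Kamppi", 900),
    ("p3", "Kamppi", 950),
]


def summarize(sightings):
    """Return (unique visitor count, visitors per station, travellers per route)."""
    # Group every sighting under the person it belongs to.
    by_person = defaultdict(list)
    for person_id, station, timestamp in sightings:
        by_person[person_id].append((timestamp, station))

    station_counts = Counter()
    route_counts = Counter()
    for visits in by_person.values():
        visits.sort()  # chronological order
        stations = [station for _, station in visits]
        # Count each person at most once per station.
        station_counts.update(set(stations))
        # Simplification: a route is (first station seen, last station seen).
        if stations[0] != stations[-1]:
            route_counts[(stations[0], stations[-1])] += 1

    return len(by_person), station_counts, route_counts


unique_visitors, per_station, per_route = summarize(SIGHTINGS)
print(unique_visitors)
print(dict(per_station))
print(dict(per_route))
```

Everything here depends on each sighting already carrying a consistent person identifier, which is what the face-matching step provides.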
The applications for this technology are vast. Junction, Slush and other events want to know how people behave and where they spend time. Malls want to know how many visitors they have and which stores they visit. Buses and bus stops could gather data on where their passengers travel.
How we built it
We use face detection to cut faces out of video frames. Each cropped face is preprocessed to "straighten" it, and the straightened image is then passed through a deep neural network that embeds the face into a high-dimensional vector space.
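To give an idea of what "straightening" can mean, the sketch below levels a face by rotating it so that the two eyes end up on a horizontal line. This is one common approach and an assumption for illustration; the function names and the landmark format (plain `(x, y)` tuples) are hypothetical, and a real pipeline would apply the same rotation to the image pixels with an image library rather than only to the landmark points.

```python
import math


def roll_angle(left_eye, right_eye):
    """Angle (radians) of the line through the eyes relative to horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.atan2(dy, dx)


def rotate_about(point, center, angle):
    """Rotate a 2D point around a center by the given angle in radians."""
    x = point[0] - center[0]
    y = point[1] - center[1]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (center[0] + x * cos_a - y * sin_a,
            center[1] + x * sin_a + y * cos_a)


def straighten(points, left_eye, right_eye):
    """Rotate points around the midpoint between the eyes so the eyes are level."""
    angle = roll_angle(left_eye, right_eye)
    center = ((left_eye[0] + right_eye[0]) / 2,
              (left_eye[1] + right_eye[1]) / 2)
    # Rotating by the negative angle cancels the tilt of the eye line.
    return [rotate_about(p, center, -angle) for p in points]


# Example: a face tilted by 45 degrees.
left, right = (0.0, 0.0), (1.0, 1.0)
new_left, new_right = straighten([left, right], left, right)
print(new_left, new_right)
```

After the rotation both eyes share the same vertical coordinate, so the embedding network sees faces in a consistent orientation.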
The resulting vectors encode the distinguishing features of each face, so they can be compared to each other to tell different people apart: vectors from the same person lie close together, while vectors from different people lie farther apart.
This lets us work out which sightings belong to the same person and track that person across filming sites.
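The matching step can be sketched as follows. This is a simplified illustration, not our exact implementation: it assumes cosine similarity as the comparison measure, a hypothetical `Gallery` class that stores one reference vector per known person, and an arbitrary threshold of 0.8 that would need tuning on real data. The three-dimensional vectors are toy values; real face embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class Gallery:
    """Hypothetical store of one reference embedding per known person."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed value, not tuned
        self.references = []        # list index doubles as the person id

    def assign(self, embedding):
        """Return the id of the best-matching person, enrolling a new one if needed."""
        best_id, best_sim = None, -1.0
        for person_id, reference in enumerate(self.references):
            sim = cosine_similarity(embedding, reference)
            if sim > best_sim:
                best_id, best_sim = person_id, sim
        if best_id is not None and best_sim >= self.threshold:
            return best_id
        # No sufficiently similar face seen before: treat as a new person.
        self.references.append(list(embedding))
        return len(self.references) - 1


gallery = Gallery()
first = gallery.assign([1.0, 0.0, 0.0])    # new person
again = gallery.assign([0.9, 0.1, 0.0])    # close to the first vector
other = gallery.assign([0.0, 1.0, 0.0])    # far from the first vector
print(first, again, other)
```

Because the same gallery can be shared by every camera, a person enrolled at one station keeps the same id when spotted at another, which is what makes cross-site tracking possible.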
The front end is built with state-of-the-art front-end technology and gives the user an easy-to-use interface for exploring and interacting with the data.
Challenges we ran into
Recognizing and analyzing faces is not easy, and it took the bulk of our time. It was also tough to gather good-quality data for the demo. Front-end development was a challenge for us, but luckily we managed to recruit two hardcore front-end devs on Saturday evening.
Accomplishments that we're proud of
We managed to pull off the heavy lifting in the image processing. We wrapped the whole thing together despite the huge workload. It works pretty well considering it was put together in a weekend.
What we learned
Computer vision is hard. Front-end development has progressed and become much more complex over the years.
What's next for CrowdMill
We want to improve our image processing pipeline to achieve better accuracy and then try it out in the real world.