Police often have a hard time tracking criminals with the security cameras around a city because of the low image quality those cameras usually have. It is hard and exhausting to look at each frame and try to identify a face. So what's a better way? There are many things that represent who we are: our face, fingerprint, voice... but one thing people may have overlooked is body movement. We have all had the experience of identifying a person from a distance by his or her walking pattern. Could that possibly be the next Face ID?
What it does
Identify a person based on his or her walking pattern.
How we built it
We built the movement recognition on a Kinect, which tracks a person's angular velocity and angular acceleration over 2-second windows. We then wrote helper functions to normalize the data, so that no matter which angle you walk toward the camera from, the movement data stays the same. We wanted to use machine learning to train a neural network, but training one from scratch in such a short time is extremely hard. So we decided to use transfer learning, which is far faster and gives an accuracy we could not possibly have reached with a network built from scratch. To use the InceptionV3 model, we wrote several functions that transform the movement data into images effectively. Then we applied transfer learning, retraining only the last layer of the neural network, to finish up the whole thing.
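As an illustration of the angle normalization, one minimal sketch is to rotate each floor-plane trajectory so the net walking direction always points along the same axis. The function name and the exact rotation scheme here are our assumptions, not necessarily the project's code:

```python
import math

def normalize_direction(points):
    """Rotate a 2-D floor-plane trajectory so the net walking direction
    lies along +x. `points` is a list of (x, z) positions; a hypothetical
    helper, the project's real normalization may differ."""
    dx = points[-1][0] - points[0][0]
    dz = points[-1][1] - points[0][1]
    theta = math.atan2(dz, dx)          # current walking direction
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)
    # Standard 2-D rotation by -theta applied to every sample.
    return [(x * cos_t - z * sin_t, x * sin_t + z * cos_t)
            for x, z in points]
```

After this step, a person walking diagonally toward the camera produces the same trajectory as one walking straight at it, which is what lets a single model handle arbitrary approach angles.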
Our training is based on a 10-minute walk by 4 people, and the final identification accuracy is 86.7%, which is not bad and shows that a person's walking pattern can indeed be a potential new Face ID: BodyID.
Challenges we ran into
The frequency and accuracy at which we collect data from the Kinect: 40 Hz or 100 Hz? How do we deal with data that exceeds our expectations, and how do we smooth it out? We tried both 40 Hz and 100 Hz and found that 40 Hz is optimal, because one picture then contains approximately two steps – a complete gait cycle with just enough information to determine the uniqueness of one's movement. For an individual data point, we averaged all measurements within a 0.05 s window and used that average as the result, so the data is smoothed out without outliers and extreme values.
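The 0.05 s window averaging described above can be sketched like this (`smooth` is a hypothetical helper; the real pipeline may differ in how it buckets samples):

```python
def smooth(samples, dt=0.05):
    """Average all (time, value) samples that fall in the same `dt`-second
    window, returning one value per non-empty window in time order."""
    buckets = {}
    for t, v in samples:
        buckets.setdefault(int(t // dt), []).append(v)
    return [sum(vs) / len(vs) for _, vs in sorted(buckets.items())]
```

For example, two samples taken at 0.00 s and 0.01 s collapse into their mean, so a single noisy spike cannot dominate any window.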
How do we extract information that represents the uniqueness of each person's movement pattern: which data to use? Angle? Velocity? Angular velocity? Acceleration? Angular acceleration? Which body part? We used the angular velocity and angular acceleration of ten major body joints to represent a person's movement pattern, because an arm or leg moves in a periodic way, which can be modeled as a system of differential equations similar to a pendulum or a vibrating string.
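Given a joint's angle series, the angular velocity and acceleration can be approximated by finite differences at the 40 Hz sampling rate chosen above. This is an illustrative sketch, not the project's exact code:

```python
def derivatives(angles, hz=40):
    """First- and second-order finite differences of a joint-angle series
    sampled at `hz` Hz: angular velocity and angular acceleration."""
    dt = 1.0 / hz
    vel = [(b - a) / dt for a, b in zip(angles, angles[1:])]
    acc = [(b - a) / dt for a, b in zip(vel, vel[1:])]
    return vel, acc
```

Running this per joint over a 2-second window yields the ten-joint feature series that gets rendered into one training image.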
How do we represent the relationship between individual data points? Different kinds of interpolation and algorithms were involved. We tried two methods: 1) using different kinds of interpolation to fill the pixels that are not in the original data we collected, to enhance trends that could be hidden in the discrete data points; 2) using only the original data to fill the entire picture. The second method is easier, and it works.
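Method 2 – filling the picture with only the original data – can be sketched as min-max scaling each value into a grayscale pixel. The scaling scheme here is our assumption, not the project's exact mapping:

```python
def to_pixels(series):
    """Min-max scale a 2-D list of movement values into 0-255 grayscale
    pixel intensities (rows = joints/features, columns = time steps)."""
    flat = [v for row in series for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0   # avoid division by zero on constant input
    return [[int(255 * (v - lo) / span) for v in row] for row in series]
```

A grid like this can then be written out with Pillow as one training image per 2-second window.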
How do we generate a huge number of pictures from a huge amount of data (over 600 pictures and hundreds of thousands of data samples)? How do we reduce the time to generate an individual picture?
First, we read directly from the .txt files generated by Kinect, writing a function to extract lists from each file. Second, we optimized the run time of the picture-generating algorithm so that generating a single picture costs less than half a second.
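Reading the lists from a Kinect .txt export might look like the following, assuming one comma-separated frame per line (the actual file layout is our assumption):

```python
def read_frames(path):
    """Parse a Kinect text export into a list of frames, where each
    non-empty line holds comma-separated floats for one frame."""
    frames = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                frames.append([float(x) for x in line.split(",")])
    return frames
```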
We do not fully understand how machine learning works (no one in the world fully does, actually): we tested multiple combinations of data processing, such as linear, quadratic, and logarithmic interpolation, to see which one works best for the machine learning model to distinguish one person from another.
Testing the accuracy of pictures generated by different interpolations: we devoted hours to getting raw data from the treadmill, processing the raw data into pictures, and training the machine learning algorithm to detect patterns in those pictures, but nothing could be determined until the final results came out.
Accomplishments that we're proud of
Collected accurate data from Kinect sensor by utilizing various techniques such as smoothing out and noise reduction.
Accurately reflected the trend and information of one’s body movement in the picture generated, thus allowing accurate machine learning results.
Utilized a sophisticated deep learning module and trained the neural network to a success rate of 86.7%.
What we learned
Reading data effectively from Kinect.
How to smooth out the data points with respect to time.
Rapidly reading raw data and generating pictures by using Pillow.
Optimizing the big-O of the interpolation functions by using cache-friendly code and optimal algorithms.
Tuning existing machine learning algorithm and model to fit our specific tasks.
What's next for BodyID
Tracking criminals: an efficient way to identify a criminal on the run from low-quality video.
Smart home: your door opens automatically as you walk toward it.
Office: your identity is verified and you are checked in as you walk through the security gate of your office building, which greets you with "Hello Mr./Ms. ___".
We think the applications of BodyID are wide because of its low cost (any camera can easily capture a person's figure) and easy implementation.