Shoulder Surfer

Inspiration

Security agencies often need access to devices (mobile phones in particular) when the password is unavailable. What if we capture a video of the person unlocking the phone before it is confiscated? We plan to use this video feed to decipher the device's password.

What it does

By analysing the thumb movement used to unlock the phone, we attempt to figure out the password of the mobile device.

How we built it

The first step was to gather a dataset for this problem. We captured two videos: one of a person typing their password into the phone, and a screen recording of the phone showing the keys being pressed on the lock screen. The two videos had different frame rates, so one of them had to be scaled down to match the other. Next, we trained a convolutional neural network on the combined feeds: X_train was the video of the person's thumb movements, and y_train was the screen recording, which supplies the true values of the passcode.
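
The frame-rate matching step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `align_frame_indices` and the 60 fps and 30 fps figures are assumptions made for the example. It picks, for every frame slot at the lower rate, the source frame that is closest in time, so that each thumb-video frame can be paired with one screen-recording frame.

```python
def align_frame_indices(n_src, src_fps, dst_fps):
    """Map each frame slot at dst_fps to the nearest-in-time source frame index.

    n_src   -- number of frames in the source video (hypothetical input)
    src_fps -- frame rate of the source video
    dst_fps -- frame rate we want to resample to
    """
    # Number of frames the clip would have at the target rate.
    n_dst = round(n_src * dst_fps / src_fps)
    indices = []
    for i in range(n_dst):
        t = i / dst_fps                      # timestamp of target frame i
        j = round(t * src_fps)               # nearest source frame at that time
        indices.append(min(n_src - 1, j))    # clamp to the last valid frame
    return indices


# Assumed example: a 1-second clip at 60 fps resampled to 30 fps.
kept = align_frame_indices(60, 60.0, 30.0)
print(len(kept), kept[:4])
```

Nearest-timestamp selection keeps the two feeds synchronised without interpolating pixels, which seems sufficient when the labels are discrete key presses.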

The model returns a probability distribution over the digits 0 to 9. This distribution is then fed into a second program, which estimates how likely a password is based on existing password data. For example, it tries to predict whether 1-2-3 is more likely to be followed by 4 or by 0. We know that not all passwords follow these patterns, but having a probability distribution over several 'nearby' possibilities helps boost our predictive ability. The output of this final step is a list of candidate passwords, each with its own probability.
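
One simple way to estimate "which digit tends to follow which" from existing password data is a smoothed digit-to-digit transition table. The sketch below is an assumption about how such a prior could be built, not a description of the project's program: it conditions only on the previous digit (a simplification of the 1-2-3 example), the function name `transition_probs` is hypothetical, and the sample PIN list is made up for illustration.

```python
from collections import defaultdict

DIGITS = "0123456789"


def transition_probs(pins, smoothing=1.0):
    """Estimate P(next digit | previous digit) from a list of PIN strings."""
    counts = defaultdict(lambda: defaultdict(float))
    for pin in pins:
        # Count every adjacent pair of digits in the PIN.
        for prev, nxt in zip(pin, pin[1:]):
            counts[prev][nxt] += 1.0
    probs = {}
    for prev in DIGITS:
        # Add-one style smoothing so unseen transitions keep a small probability.
        total = sum(counts[prev].values()) + smoothing * len(DIGITS)
        probs[prev] = {d: (counts[prev][d] + smoothing) / total for d in DIGITS}
    return probs


# Made-up sample data, standing in for a real password dump.
sample_pins = ["1234", "1234", "1230", "0000", "2580"]
probs = transition_probs(sample_pins)
print(probs["3"]["4"], probs["3"]["0"])
```

With this toy list, a 3 is followed by 4 more often than by 0, so the table assigns 4 the higher probability while still leaving room for less common continuations.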

Lastly, a visualization is shown with probability distributions overlaid on top of each other.

Challenges we ran into

There was no available dataset to train our model, so we had to generate the data ourselves by recording multiple videos.
It was challenging to choose a weighting between the PIN probabilities generated by the neural network and the PIN probabilities derived from real-world data.
It was challenging to run state-of-the-art neural networks given the limited capacity of our laptops, so we had to decide which portions of the networks to actually use.
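
To make the weighting problem concrete, here is a hedged sketch of one common option: a single weight `alpha` that blends the log-probability from the neural network with the log-probability from real-world PIN frequencies. The names `rank_pins`, `cnn_log_prob` and `peaked`, the value of `alpha`, and all the numbers are assumptions for illustration; the project may have combined the two sources differently.

```python
import math

EPS = 1e-12  # floor to avoid log(0)


def cnn_log_prob(pin, per_position):
    """Log-probability of a PIN under per-position digit distributions."""
    return sum(math.log(max(per_position[i][int(d)], EPS))
               for i, d in enumerate(pin))


def rank_pins(per_position, prior, alpha=0.5):
    """Rank candidate PINs by alpha * log P_cnn + (1 - alpha) * log P_prior."""
    scored = []
    for pin, p in prior.items():
        score = (alpha * cnn_log_prob(pin, per_position)
                 + (1.0 - alpha) * math.log(max(p, EPS)))
        scored.append((score, pin))
    scored.sort(reverse=True)  # highest combined score first
    return [pin for _, pin in scored]


def peaked(digit, p=0.6):
    """Toy distribution: probability p on one digit, the rest spread evenly."""
    return [p if d == digit else (1.0 - p) / 9.0 for d in range(10)]


# Made-up inputs: the network favours 1-2-3-0, the prior favours 1234.
per_position = [peaked(1), peaked(2), peaked(3), peaked(0)]
prior = {"1234": 0.10, "1230": 0.01, "0000": 0.05}
print(rank_pins(per_position, prior, alpha=0.5))
```

Setting `alpha` to 1.0 trusts only the video model and setting it to 0.0 trusts only the password data, so the difficulty described above amounts to picking a value in between.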

Accomplishments that we're proud of

A significant amount of time went into generating the dataset for this project. Data is almost always difficult to gather, whether because of availability issues or permission issues. We figured out a way to generate our own dataset to train the machine learning model.

Combining two different videos (recorded at different frame rates) to train the model is something that we're really proud of.

Building the entire end-to-end system, from video capture to visualization of the probable PIN possibilities, was extremely rewarding.

What we learned

Data collection techniques
Building an image classifier from multiple video feeds
Combining results from different methods, such as a neural network and password dumps
Building end-to-end systems