Inspiration

Person of Interest is a dystopian sci-fi TV programme. The premise is that after 9/11, the government hires a company to build an artificial super-intelligence capable of predicting terrorist attacks before they happen. The AI, known as "The Machine", uses the government CCTV feeds (which show everything happening everywhere in the US) and identifies suspicious actions using advanced reasoning. The creator of The Machine then secretly modifies it so that it notices all kinds of crimes, and teaches it to ring a nearby payphone to notify him, so that the crime can be stopped before it happens.

What it does

One laptop hosts a website that simulates the payphone. The second laptop runs a program that analyses its camera footage. When the second laptop spots a banana, it reports this as a violent crime and causes the "phone" on the first laptop to ring. When the "phone" is picked up, the first laptop's website shows the processed footage, which identifies the innocents (those not currently holding a banana) and the perpetrator (the one currently holding a banana).
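The ring-then-pick-up behaviour described above can be sketched as a tiny state machine. This is only an illustration, not our actual server code: the names `PhoneState`, `Payphone`, `report_banana`, `pick_up` and `shows_footage` are all hypothetical.

```python
from enum import Enum


class PhoneState(Enum):
    IDLE = "idle"
    RINGING = "ringing"
    PICKED_UP = "picked_up"


class Payphone:
    """Hypothetical model of the simulated payphone's state."""

    def __init__(self):
        self.state = PhoneState.IDLE

    def report_banana(self):
        # Only the first sighting starts the ringing; sightings while
        # the phone is already ringing or picked up change nothing.
        if self.state is PhoneState.IDLE:
            self.state = PhoneState.RINGING

    def pick_up(self):
        # Picking up only has an effect while the phone is ringing.
        if self.state is PhoneState.RINGING:
            self.state = PhoneState.PICKED_UP

    def shows_footage(self):
        # The processed footage is only shown once the call is answered.
        return self.state is PhoneState.PICKED_UP
```

In this sketch the website would poll the state to decide whether to play the ringtone or display the feed.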

How we built it

Mabel built the front-end website, and Oliver designed the machine learning in the backend. The backend analyses footage from the camera and uses a semantic segmentation U-Net (which Oliver trained on his own synthetic data) to identify pixel-wise banananess. If a large clump of pixels classified as banana is detected, we use OpenPose's pre-trained model to do pose estimation of everyone in the footage, and identify which hand of which person is closest to the banana (to work out who is holding it). We then work out where the head is on the skeleton containing the closest arm, and draw a bounding box around it. Every skeleton gets a bounding box around its head, but only the one holding the banana gets a red one (to make it clear they are the perpetrator). We then take the bounding-box-only image, the image with annotated pose estimation, and the cropped head of the perpetrator, and send all of these images to the server (which runs on Flask). The first time a banana is spotted, the server's "phone" rings, and when it is "picked up", it shows the real-time feed of these images.
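As a rough sketch of the "large clump, then closest hand" step, the snippet below works on a plain binary mask and a list of wrist positions. It is not our real pipeline: the mask format (a list of rows of 0/1), the keypoint names `left_wrist` and `right_wrist`, the `min_pixels` threshold, the 4-connectivity and the green colour for innocents are all assumptions made for illustration. The real inputs would come from the U-Net and OpenPose.

```python
from collections import deque
from math import hypot


def largest_clump(mask):
    """Return the (row, col) pixels of the largest 4-connected banana clump."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited banana pixel.
            clump, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                clump.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(clump) > len(best):
                best = clump
    return best


def find_holder(mask, people, min_pixels=4):
    """Return the index of the person whose wrist is nearest the banana, or None."""
    clump = largest_clump(mask)
    if len(clump) < min_pixels:
        return None  # clump too small to count as a banana
    cy = sum(y for y, _ in clump) / len(clump)
    cx = sum(x for _, x in clump) / len(clump)
    best_index, best_dist = None, float("inf")
    for index, person in enumerate(people):
        for wrist in ("left_wrist", "right_wrist"):
            point = person.get(wrist)  # (x, y), or None if not detected
            if point is None:
                continue
            dist = hypot(point[0] - cx, point[1] - cy)
            if dist < best_dist:
                best_index, best_dist = index, dist
    return best_index


def box_colours(people, holder):
    """Red head box for the holder; green (an assumed colour) for everyone else."""
    return ["red" if i == holder else "green" for i in range(len(people))]
```

Measuring distance to the clump's centroid is a simplification; measuring to the nearest banana pixel would behave differently for long bananas.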

Challenges we ran into

In Person of Interest, The Machine runs on a computer the size of a massive underground warehouse, whereas we only had a pair of laptops with no external GPU. This made it very hard to recreate. Predicting something before it happens would be extremely difficult without advanced reasoning (we realised while building it that OpenAI no longer allows API access on the free tier), so we decided it would be much easier to identify crimes while they are happening. However, for some reason, we were not allowed to bring a gun into the computer lab, so we made do with a banana as a stand-in. Unfortunately, on the second day they ran out of bananas, so we used yellow cardboard as a stand-in for the banana. It was also very difficult to make the project work in real time, given that everything ran on the CPU, so quite a lot of time was spent balancing speed and accuracy.

Accomplishments that we're proud of

We worked on the server and backend completely separately, and were surprised to be able to connect them in under half an hour. This is because we had each written convenient APIs for interacting with the front-end and the backend. We were also proud that the banana detection produces very few false positives, minimising unnecessary calls. Finally, we were proud of being able to train the U-Net entirely from scratch on a synthetic dataset, and of how the synthetic dataset followed a very similar data distribution to reality.

What we learned

Oliver learned how to use OpenPose in PyTorch. Mabel learned how to use Flask for her web server.

What's next for Person Of Interest Machine

In future, the project could be refined by enforcing temporal consistency, as it has a tendency to briefly forget who is holding the banana and then go back to the correct person. In addition, we would try to identify banana-based violence before it actually occurs, which is more in keeping with the original TV show, e.g. by seeing if someone is malevolently approaching a banana.
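One simple way temporal consistency might be enforced is a majority vote over the last few frames. This is a sketch of the idea only, not something we have built: `HolderSmoother` and the window size are hypothetical, and it assumes each person already has a stable identifier across frames, which would need some form of tracking on top of the per-frame pose estimation.

```python
from collections import Counter, deque


class HolderSmoother:
    """Hypothetical smoother: report the most common holder over recent frames."""

    def __init__(self, window=5):
        # Keep only the most recent `window` per-frame decisions.
        self.history = deque(maxlen=window)

    def update(self, holder):
        """Record this frame's holder and return the smoothed holder."""
        self.history.append(holder)
        counts = Counter(self.history)
        # Ties go to whichever candidate appears earliest in the window.
        return counts.most_common(1)[0][0]
```

With a window of 5, a single frame that blames the wrong person is outvoted, at the cost of a few frames of delay when the banana genuinely changes hands.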
