Inspiration
AI has now been integrated into many fields - health, education, finance, social networks, etc. However, the field of surveillance has remained complacent for a long time. Surveillance cameras are all around us - schools, colleges, hospitals, shops, offices, just to name a few. Yet security staff still watch multiple screens by hand to spot anything suspicious. Why does it have to be so tedious?
It is now time to disrupt this. The video surveillance market is currently a $16 billion industry and is expected to exceed $30 billion by 2019. If this hack sees the light of day, it can potentially penetrate the global video surveillance market. Imagine a world where accidents, natural disasters, gun violence, robbery, terrorism, etc. are detected automatically. With this hack, we are one step closer to that utopian world.
What it does
Batman-Scan is a real-time surveillance monitoring system that is constantly on the hunt for suspicious activity. If something abnormal is detected, the concerned authorities are notified immediately. Local police can also monitor the city through a map interface in real time (just like Batman watches the sky!). When unusual behaviour is detected, a distress bat-signal appears on the map.
How I built it
This hack is neatly divided into three sections:
- The "brain" which detects the abnormality in the live video.
- Nexmo's text-to-speech API, which delivers the information over a phone call. The bot reads out all the details by voice, e.g. "Suspicious activity detected at University of Pennsylvania, Philadelphia on 10th Sept 2016 at 10:56PM". If the user fails to pick up the call, a text message containing the same details is sent instead.
- A Google Maps interface which renders the distress signal in real time. The location displayed is the location of the camera.
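The call-then-text fallback in the second section can be sketched roughly as follows. This is only an illustration: `build_alert`, `notify`, `place_call` and `send_text` are hypothetical names, and the two callbacks stand in for whatever the Nexmo client actually does (no real Nexmo calls are shown here).

```python
from datetime import datetime
from typing import Callable

# Month abbreviations chosen to match the example message ("Sept").
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sept", "Oct", "Nov", "Dec"]


def ordinal(day: int) -> str:
    """Return the day with its English ordinal suffix, e.g. 10 -> '10th'."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def build_alert(location: str, when: datetime) -> str:
    """Format the sentence that is spoken on the call or sent as a text."""
    hour = when.hour % 12 or 12
    meridiem = "AM" if when.hour < 12 else "PM"
    date_part = f"{ordinal(when.day)} {MONTHS[when.month - 1]} {when.year}"
    time_part = f"{hour}:{when.minute:02d}{meridiem}"
    return f"Suspicious activity detected at {location} on {date_part} at {time_part}"


def notify(message: str,
           place_call: Callable[[str], bool],
           send_text: Callable[[str], None]) -> str:
    """Try the voice call first; fall back to a text if nobody picks up.

    place_call (hypothetical) returns True when the call was answered.
    send_text (hypothetical) delivers the same message as a text.
    """
    if place_call(message):
        return "call"
    send_text(message)
    return "text"
```

Passing the two delivery functions in as callbacks keeps the fallback logic testable without making a real phone call.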
Unlike traditional machine learning algorithms such as neural nets, which process data in batches, I use a simple statistical outlier classifier that makes decisions in real time. The following figure helps you visualize this. The beauty of this approach is that you DO NOT need training data at all. If a pattern occurs frequently enough, it is regarded as "normal" behaviour.
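To give a feel for how a classifier can decide frame by frame without a training set, here is a minimal sketch. It assumes each frame has already been reduced to a single number and uses a running z-score (Welford's online mean and variance). The write-up does not say which statistic the hack actually uses, so the class name, the threshold and the warm-up length are all illustrative assumptions.

```python
import math


class StreamingOutlierDetector:
    """Flags values that sit far from the running mean of everything seen so far."""

    def __init__(self, threshold: float = 3.0, warmup: int = 30):
        self.threshold = threshold      # z-score above which a value is abnormal
        self.warmup = max(warmup, 2)    # observations to collect before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0                   # running sum of squared deviations

    def update(self, value: float) -> bool:
        """Score the value against past data, then fold it into the statistics."""
        abnormal = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0:
                abnormal = abs(value - self.mean) / std > self.threshold
            else:
                # Every past value was identical, so any change stands out.
                abnormal = value != self.mean

        # Welford's online update: no past values need to be kept around.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return abnormal
```

Because every value is folded into the statistics after it is scored, a pattern that keeps recurring gradually shifts the mean and spread and stops being flagged, which mirrors the "frequent enough means normal" idea.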

Challenges I ran into
Let's acknowledge that working with videos is painful. When I started out, each frame had about 6 million features! (And since videos are recorded at 30 fps, each second of video would have been represented by about 180 million features!) However, after numerous attempts, I brought the feature size down to around 30,000.
To analyze historical data, it usually has to be stored in some form. But storing every feature of a video is expensive and time-consuming. To get around this, the hack represents the surveillance data by its densities rather than storing the raw data itself.
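One way to picture "densities rather than the data itself" is a fixed-size histogram per feature: only the bin counts are stored, so memory stays constant no matter how many frames have been seen. The sketch below is an assumption about what such a representation could look like, not the hack's actual code; `DensityHistogram`, the value range and the bin count are all hypothetical.

```python
class DensityHistogram:
    """Keeps bin counts for one feature instead of every observed value."""

    def __init__(self, low: float, high: float, bins: int = 16):
        self.low = low
        self.high = high
        self.counts = [0] * bins
        self.total = 0

    def _bin(self, value: float) -> int:
        """Map a value to its bin index, clamping values outside [low, high]."""
        position = (value - self.low) / (self.high - self.low)
        index = int(position * len(self.counts))
        return min(max(index, 0), len(self.counts) - 1)

    def add(self, value: float) -> None:
        """Record one observation; the raw value itself is discarded."""
        self.counts[self._bin(value)] += 1
        self.total += 1

    def density(self, value: float) -> float:
        """Fraction of all observations that fell in this value's bin."""
        if self.total == 0:
            return 0.0
        return self.counts[self._bin(value)] / self.total
```

A value that lands in a rarely visited bin has a low density, which is exactly the kind of signal an outlier classifier can threshold on.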
Dataset used for video surveillance
Accomplishments that I'm proud of
- Ensuring that the model makes accurate predictions from the input video stream in real time. (The faster the abnormality is detected, the sooner the police can take action!)
What's next for Batman Scan
- It might be useful to localize the abnormality within the frame, rather than marking the entire frame as abnormal.
- The power of GPUs can be leveraged to process frames faster.
- Minimizing the false positives.
- Rather than each surveillance camera making independent decisions, it might be more efficient if the cameras learned together as a swarm (just like how there is a "single brain" for all of Tesla's autonomous cars).
Built With
- google-maps
- nexmo
- pyth
- statistics