I wanted to explore Alexa skill development because I had never built a skill before, but there are already tons of skills out there. What could I possibly do to make mine different? Then I realized that there are not many voice-controlled system surveillance skills.

What it does

A script runs in the background on your system and lets you control it through Alexa. I used ngrok to expose localhost at a publicly accessible URL. You can issue commands to your system such as "Shutdown", "Sleep" and "Logout". That is not the fun part, though. The fun part is that the webcam takes periodic pictures and runs image recognition on them.
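
As a rough illustration of the command handling, here is a minimal sketch of how a spoken action could be mapped to an operating system command. The function names `build_command` and `run_action` are hypothetical, and the specific shell commands are assumptions about a typical Windows or systemd-based Linux machine, not the commands the project actually uses.

```python
import getpass
import platform
import subprocess

# Assumed commands for common setups; adjust for your own machine.
_COMMANDS = {
    ("Windows", "shutdown"): ["shutdown", "/s", "/t", "0"],
    ("Windows", "logout"): ["shutdown", "/l"],
    # May hibernate instead of sleep if hibernation is enabled.
    ("Windows", "sleep"): ["rundll32.exe", "powrprof.dll,SetSuspendState", "0,1,0"],
    ("Linux", "shutdown"): ["systemctl", "poweroff"],
    ("Linux", "sleep"): ["systemctl", "suspend"],
}


def build_command(action, system):
    """Return the argument list for an action on the given platform name."""
    action = action.lower()
    if (system, action) == ("Linux", "logout"):
        # Ends every session of the current user (systemd-based systems).
        return ["loginctl", "terminate-user", getpass.getuser()]
    try:
        return list(_COMMANDS[(system, action)])
    except KeyError:
        raise ValueError(f"Unsupported action {action!r} on {system!r}")


def run_action(action, dry_run=True):
    """Build the command for this machine and run it unless dry_run is set."""
    command = build_command(action, platform.system())
    if not dry_run:
        subprocess.run(command, check=True)
    return command
```

Keeping `dry_run=True` as the default makes it possible to check the mapping without actually shutting the machine down.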

I use an undocumented Facebook image recognition API to recognize the people in the webcam pictures. This is much faster and more accurate than building my own image recognition algorithm. Sometimes Facebook cannot recognize the people in a picture, so as a fallback I also use the Google Cloud Vision API to label the webcam images.
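
The fallback logic described above can be sketched roughly as follows. Because the Facebook API is undocumented, the sketch does not call it; `recognize_people` and `label_image` are hypothetical callables standing in for the Facebook lookup and the Google Cloud Vision labeling, and the spoken phrases are made up for illustration.

```python
from typing import Callable, List


def describe_image(
    image_url: str,
    recognize_people: Callable[[str], List[str]],
    label_image: Callable[[str], List[str]],
) -> str:
    """Try person recognition first, then fall back to generic labels."""
    try:
        names = recognize_people(image_url)
    except Exception:
        # Treat any failure of the primary recognizer as "nobody recognized".
        names = []

    if names:
        return "I can see " + ", ".join(names) + "."

    labels = label_image(image_url)
    if labels:
        return "I did not recognize anyone, but I can see: " + ", ".join(labels) + "."
    return "I could not make out anything in the picture."
```

Passing the two recognizers in as arguments keeps the fallback order easy to test without network access.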

How I built it

I wrote custom Python scripts for AWS Lambda. I implemented the localhost server with Flask and used ngrok to reach it over the internet. I also upload the webcam images to an image hosting service so that I can use them with Facebook.
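
To make the Lambda side more concrete, here is a minimal sketch of a handler that forwards an intent to the local server through the ngrok tunnel and wraps the answer in the Alexa response format. The URL, the `/command/<intent>` path, the `speech` field in the server's JSON reply, and the helper names are all assumptions for illustration; only the overall request and response shape follows the Alexa custom skill JSON format.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical tunnel address; ngrok prints the real one when it starts.
LOCAL_SERVER_URL = "https://example.ngrok.io"


def build_response(text, end_session=True):
    """Wrap plain text in the JSON structure Alexa expects from a skill."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def forward_to_local(intent_name, timeout=3.0):
    """Call the (assumed) local endpoint and return the text it wants spoken."""
    url = f"{LOCAL_SERVER_URL}/command/{urllib.parse.quote(intent_name)}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return payload.get("speech", "Done.")


def lambda_handler(event, context, forward=forward_to_local):
    """Entry point: route launch and intent requests."""
    request = event.get("request", {})
    if request.get("type") == "LaunchRequest":
        return build_response("System Monitor is ready.", end_session=False)
    if request.get("type") == "IntentRequest":
        try:
            speech = forward(request["intent"]["name"])
        except Exception:
            # Tunnel down, timeout, or malformed reply from the local server.
            speech = "I could not reach your computer."
        return build_response(speech)
    return build_response("Goodbye.")
```

The short timeout is deliberate: a quick "could not reach your computer" is friendlier than letting the whole request fail.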

Challenges I ran into

  • Understanding how Alexa works. This was literally my first time using Alexa.
  • Interfacing with localhost from AWS Lambda
  • Reducing the processing time on the server, because otherwise Alexa gives an error
  • Finding and working with a free image uploading API
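
One common way to keep the server's response time low, sketched here as an assumption rather than as what this project does, is to run the slow webcam and recognition work in a background loop and have the request handler return only the most recent cached result. The names `LatestResult` and `run_periodically` are hypothetical.

```python
import threading


class LatestResult:
    """Thread-safe holder for the most recent analysis result."""

    def __init__(self, initial="No picture has been analysed yet."):
        self._lock = threading.Lock()
        self._value = initial

    def set(self, value):
        with self._lock:
            self._value = value

    def get(self):
        with self._lock:
            return self._value


def run_periodically(task, cache, interval_seconds, stop_event):
    """Call task() repeatedly, storing each result until stop_event is set."""
    while not stop_event.is_set():
        try:
            cache.set(task())
        except Exception:
            # Keep the last good result if one capture or lookup fails.
            pass
        # Sleeps for the interval, but wakes early when stop_event is set.
        stop_event.wait(interval_seconds)
```

In use, `run_periodically` would be started in a daemon `threading.Thread`, and the request handler would call only `cache.get()`, which returns immediately.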

Accomplishments that I'm proud of

  • Successfully creating an Alexa skill
  • Successfully reverse engineering the Facebook image recognition API

What's next for System Monitor

  • Buying an Alexa for myself for future testing :P
  • Adding more commands