Inspiration

One day I was getting ready for a date. I looked at myself in the mirror and asked: what do I look like when I speak? What kind of expressions am I making? How can I make sure I don't scare my date off with a freaky facial expression? I told myself I should build something to help with that. So here I am, building an AI to keep me from making those mistakes.

What it does

This AI can take both images and videos of people's faces. It classifies each face into 7 emotions (angry, disgust, fear, happy, sad, surprise, neutral) and outputs the percentage for each emotion the AI thinks the face is showing. With the live feed version, you can watch yourself in real time to see what face you are making.
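The per-emotion percentages can be sketched as a softmax over the model's raw scores. This is a minimal illustration, assuming the label order below and made-up scores (the writeup doesn't give either):

```python
import numpy as np

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def to_percentages(logits):
    """Convert raw model scores into softmax percentages, one per emotion."""
    exp = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    probs = exp / exp.sum()
    return {label: round(100 * float(p), 1) for label, p in zip(EMOTIONS, probs)}

scores = np.array([0.2, -1.0, 0.1, 2.5, 0.3, -0.5, 1.0])  # hypothetical scores
print(to_percentages(scores))  # "happy" gets the largest share here
```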

How I built it

In the beginning, I needed to find a good dataset for this task, and I found the perfect one. A 2013 challenge provided a good dataset with fully labeled faces, called fer2013.
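fer2013 is distributed as a CSV where each row holds an emotion label, a space-separated string of 48x48 grayscale pixel values, and a Usage column (Training / PublicTest / PrivateTest). A minimal loader sketch, assuming that layout:

```python
import csv
import io

import numpy as np

def load_fer2013(file_obj):
    """Parse fer2013-style rows (emotion, pixels, Usage) into NumPy arrays.

    Each `pixels` field is a space-separated string of 48*48 grayscale values.
    """
    faces, labels, usages = [], [], []
    for row in csv.DictReader(file_obj):
        face = np.array(row["pixels"].split(), dtype=np.uint8).reshape(48, 48)
        faces.append(face)
        labels.append(int(row["emotion"]))
        usages.append(row["Usage"])
    return np.stack(faces), np.array(labels), usages
```

In practice you would call it as `load_fer2013(open("fer2013.csv"))` and then split on the Usage column to get the training and test sets.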

After exploring the dataset with some NumPy code, I had an idea of how to build the model. I set up the training and testing data, built a 3-layer CNN, and connected the layers to output a model. Because I'm using a very good GPU (an RTX 2080), I could set the batch size quite large to speed up training.
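The writeup only says "3 layer CNN", so the filter counts, dense head, and dropout below are my own assumptions; this is a plausible Keras sketch, not the author's exact architecture:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(num_classes=7):
    """A plausible 3-convolutional-layer CNN for 48x48 grayscale faces.

    Layer sizes and dropout are assumptions; the writeup gives no details.
    """
    return keras.Sequential([
        keras.Input(shape=(48, 48, 1)),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_model()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# A large batch size (e.g. 256) is feasible on an RTX 2080:
# model.fit(x_train, y_train, batch_size=256, epochs=40,
#           validation_data=(x_test, y_test))
```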

After running it for 40 epochs, with each epoch taking only 5-6 seconds, I achieved 98% accuracy on the training set and 58.4% on the test set. That would have put me quite high on the leaderboard if I had entered the challenge, so I'm quite happy with it.

With the model trained and saved, I started to test what it could do. I found some images of myself, along with some interesting images a friend sent me. After feeding them to the model, the outputs were quite good; I have to agree with the AI on most of them. See the images I uploaded. What do you think?
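To feed an arbitrary image to the model, the face crop has to match the training format (48x48, normalized, with batch and channel dimensions). A NumPy-only preprocessing sketch, where nearest-neighbor resizing and [0, 1] normalization are my assumptions:

```python
import numpy as np

def preprocess_face(gray, size=48):
    """Resize a 2-D grayscale face crop to size x size (nearest neighbor)
    and normalize to [0, 1], shaped (1, size, size, 1) as Keras expects."""
    gray = np.asarray(gray, dtype=np.float32)
    h, w = gray.shape
    rows = np.arange(size) * h // size  # source row index for each output row
    cols = np.arange(size) * w // size  # source column index for each output column
    resized = gray[rows][:, cols]
    return (resized / 255.0).reshape(1, size, size, 1)
```

The result can then go straight into `model.predict(...)` to get the seven emotion percentages for that image.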

Now that I had a good model, I wanted to build a live version: grab video from the webcam and show results in real time. After playing with some OpenCV code, I was able to capture video from the webcam, resize the frames, and locate the face using one of OpenCV's public cascade files (haarcascade_frontalface_default.xml). I could then crop the face, feed it to the model, and display the outcome both on the image and on the side. And with that, this little project was complete.
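The live loop above can be sketched roughly as follows. This is an assumption-heavy reconstruction, not the author's code: the label order, resize size, and overlay style are my choices, and `model` is any Keras-style classifier with a `predict` method:

```python
import numpy as np

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def best_emotion(probs):
    """Format the top emotion and its percentage from one output row."""
    i = int(np.argmax(probs))
    return "%s %.0f%%" % (EMOTIONS[i], 100 * probs[i])

def run_live(model, cascade_path="haarcascade_frontalface_default.xml"):
    """Webcam loop: detect faces with OpenCV's Haar cascade, classify each
    crop, and overlay the result on the frame. Press q to quit.
    (cv2 is imported here so the helper above works without OpenCV.)"""
    import cv2
    detector = cv2.CascadeClassifier(cascade_path)
    cap = cv2.VideoCapture(0)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in detector.detectMultiScale(gray, 1.3, 5):
            face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
            probs = model.predict(face.reshape(1, 48, 48, 1) / 255.0)[0]
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.putText(frame, best_emotion(probs), (x, y - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
        cv2.imshow("emotion", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```

Calling `run_live(model)` starts the demo; the cascade XML ships with OpenCV, so only its path needs adjusting.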

Challenges I ran into

The first challenge I faced was being abandoned by my teammates twice in a row. No hard feelings, but it has been really difficult to find teammates no matter how many times I tried. All I wanted was to do something related to AI or hardware, but it seems most people who were interested already had a team, and my planned teammate wanted to work on something else. After many attempts at team building, I still found myself alone.

Also, because the hardware lab didn't have webcams ready, I had to go out to Micro Center and buy one for $60. Ouch.

During training, I ran into some challenges. The biggest one was TensorFlow: without a GPU, it's hard to work with. I tried many times to train on my laptop; not only did a single epoch take 1,000 seconds, but the laptop also kept running out of RAM. So I decided to train on my gaming desktop, which has an i7-7700K, an RTX 2080, and 16 GB of RAM. That was a good call and saved me a lot of time. After training, I also had trouble locating the face. I considered building another CNN for face detection, but I didn't want to go that route. After a lot of googling, I found a fast way to do it: use OpenCV's pretrained cascade to select the face, and that solved my problem.

Accomplishments that I'm proud of

I'm proud that I made something useful! Not everything went as planned, but I was still able to pull it off. I enjoyed this one a lot.

What I learned

First, build teams early! Second, use the right tool for the job! Third, it doesn't hurt to keep trying, and sometimes it gets me what I want!

What's next for Emotion Recognizer

I want to build it into an app or a website so people can upload their images to test it out. For the live feed version, I think it could be a great help for acting, public speaking, or going on a date. So I want to pair it with an NLP (natural language processing) system that can also tell the emotion in one's voice. You could practice in front of the webcam and get a report on how you did. I think many people would love that. It could also be used to classify videos online, assigning emotion tags to videos to help people find what they need.

Built With
