Inspiration

The inspiration came from daily conversations with a friend whose accent I have trouble understanding. That gave me the idea of transcribing a silent video to text based only on lip movement. It could also help the many people with disabilities around the world communicate more easily.

What it does

It uses keyword spotting to trigger a video recorder on a Mac. After the video is recorded and saved, the machine learning model I trained transcribes the lip movements to text without relying on any audio.

How we built it

I tried two approaches to building the model, and one of them failed. The one that succeeded uses three Conv3D layers with max pooling, two Bidirectional LSTM layers, one TimeDistributed layer, and one Dense layer. For data preprocessing, I used OpenCV's CascadeClassifier to detect the lip region, which improves performance. For input, I connected OpenCV to an iPhone's camera to record videos.
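The layer stack described above could be sketched in Keras roughly as follows. This is a minimal illustration, not the exact model: the input size (75 frames of 46x140 grayscale lip crops), the filter counts, the LSTM width, the vocabulary size, and the function names are all assumptions, and the TimeDistributed layer is assumed to wrap a Flatten between the convolutional and recurrent parts.

```python
def feature_shape(frames, height, width, n_blocks=3):
    """Return (frames, h, w) after n_blocks of (1, 2, 2) max pooling.

    Pooling only over height and width keeps one feature map per frame,
    so the LSTM layers still see the full time axis.
    """
    for _ in range(n_blocks):
        height, width = height // 2, width // 2
    return frames, height, width


def build_model(frames=75, height=46, width=140, channels=1, vocab_size=40):
    """Build a Conv3D + Bidirectional LSTM model (all sizes are hypothetical)."""
    # Imported lazily so feature_shape() works without TensorFlow installed.
    from tensorflow.keras import layers, models

    model = models.Sequential()
    model.add(layers.Input(shape=(frames, height, width, channels)))

    # Three Conv3D + max pooling blocks (filter counts are assumptions).
    for filters in (32, 64, 96):
        model.add(layers.Conv3D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling3D(pool_size=(1, 2, 2)))

    # Flatten each frame's feature map into one vector per time step.
    model.add(layers.TimeDistributed(layers.Flatten()))

    # Two Bidirectional LSTM layers that read the frame sequence both ways.
    for _ in range(2):
        model.add(layers.Bidirectional(layers.LSTM(128, return_sequences=True)))

    # Per-frame distribution over a hypothetical character vocabulary.
    model.add(layers.Dense(vocab_size, activation="softmax"))
    return model


if __name__ == "__main__":
    print(feature_shape(75, 46, 140))
    build_model().summary()
```

Pooling with `(1, 2, 2)` shrinks each frame spatially while leaving the number of frames unchanged, which is why the output still has one prediction per frame.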

Challenges we ran into

The biggest challenges were not only the massive amount of research, getting TensorFlow to work with the GPU, designing models while still learning how they work, and data preprocessing, but also the frequent emotional ups and downs.

Accomplishments that we're proud of

I successfully trained a multi-layer model, used it with different data, and generated a lot of interesting results and images. Almost everything works!

What we learned

It was very cool to learn how to connect OpenCV to an iPhone's camera and to get to know OpenCV's fantastic libraries. I also learned how to deploy a multi-layer model for computer vision.

What's next for BeQuiet

There are still many improvements I need to make to this project, including retraining the model and rewriting the input data preprocessing to be more concise. I also want to add more interfaces and connections between the major blocks of code.
