Since the COVID-19 pandemic, people all over the world rely more and more on online video communication, and more and more kids attend school online. However, online video communication often lacks interaction, so listeners, and especially kids, tend to lose their attention.

What it does

G’express, short for Gesture Express, aims to bridge the gap between online teaching platforms and users’ gesture expression, bringing more natural interaction to video communication. With the help of computer vision and deep learning, teachers’ and students’ gestures are enhanced by triggering emoji, sounds, or animations in the real-time video stream.

How I built it

We use a deep learning model (a temporal convolution encoder followed by fully connected layers) to detect gestures from the hand skeleton extracted from each frame. We then use Python Flask and JavaScript to render the live video stream and overlay the visual and sound effects.
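As a rough illustration of the detection side, here is a minimal sketch of a temporal-convolution gesture classifier in PyTorch. The input layout (windows of 21 hand landmarks with (x, y) coordinates, a MediaPipe-style skeleton), the window length, the hidden size, and the number of gesture classes are all assumptions for the example, not the project's actual architecture or hyperparameters:

```python
import torch
import torch.nn as nn


class GestureClassifier(nn.Module):
    """Temporal convolution encoder + classifier head (illustrative sketch).

    Input: a window of hand-skeleton frames, shape (batch, T, 42),
    where 42 = 21 landmarks x (x, y) coordinates (assumed layout).
    """

    def __init__(self, n_features=42, n_classes=5, hidden=64):
        super().__init__()
        # 1-D convolutions along the time axis encode short motion patterns.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over the time dimension
        )
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, T, features); Conv1d expects (batch, features, T).
        z = self.encoder(x.transpose(1, 2)).squeeze(-1)
        return self.head(z)  # per-gesture logits


if __name__ == "__main__":
    model = GestureClassifier()
    window = torch.randn(1, 16, 42)  # 16 frames of 21 (x, y) landmarks
    logits = model(window)
    print(logits.shape)  # one logit per gesture class
```

At inference time, the predicted class for each sliding window would decide which emoji, sound, or animation the Flask/JavaScript layer overlays on the stream.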

Challenges I ran into

We ran into challenges with running speed and model accuracy, and worked to find the best balance between the two.

Accomplishments that I'm proud of

We succeeded in building a working prototype from an idea, and we integrated AI research and algorithms into an everyday application.

What I learned

How to take an AI algorithm from research to production, and how to train Siamese networks.

What's next for G'express

Develop more use cases, and improve running speed and model accuracy.
