Panim

Inspiration

Panim started as an idea for saving bandwidth on video calls, and then we kept discovering more and more possibilities that this technology unlocks.

What it does

Panim is a new form of communication: more interactive than a phone call, but less intensive than a video call. An avatar represents you in the call. Your camera is used locally to detect your facial expressions and motion, but the video itself is never sent to your peers. Only your face landmarks are sent: 468 points that describe your current expression and position.
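To make the idea of sending landmarks instead of video more concrete, here is a minimal sketch of how one frame of landmarks could be packed into a single buffer and unpacked again on the other side. The `Landmark` shape, the three coordinates per point, and the function names are assumptions made for illustration, not Panim's actual code.

```typescript
// One landmark: x and y in image space plus a relative depth z (assumed layout).
interface Landmark {
  x: number;
  y: number;
  z: number;
}

const LANDMARK_COUNT = 468; // landmarks per frame, as described above
const COORDS_PER_LANDMARK = 3; // assumption: x, y, z per point

// Flatten the landmarks into one typed array so a frame is a single compact buffer.
function packLandmarks(landmarks: Landmark[]): Float32Array {
  if (landmarks.length !== LANDMARK_COUNT) {
    throw new Error(`expected ${LANDMARK_COUNT} landmarks, got ${landmarks.length}`);
  }
  const packed = new Float32Array(LANDMARK_COUNT * COORDS_PER_LANDMARK);
  landmarks.forEach((p, i) => {
    packed[i * COORDS_PER_LANDMARK] = p.x;
    packed[i * COORDS_PER_LANDMARK + 1] = p.y;
    packed[i * COORDS_PER_LANDMARK + 2] = p.z;
  });
  return packed;
}

// Rebuild the landmark objects on the receiving side.
function unpackLandmarks(packed: Float32Array): Landmark[] {
  const landmarks: Landmark[] = [];
  for (let i = 0; i < packed.length; i += COORDS_PER_LANDMARK) {
    landmarks.push({ x: packed[i], y: packed[i + 1], z: packed[i + 2] });
  }
  return landmarks;
}
```

Under these assumptions a frame is 468 × 3 × 4 = 5,616 bytes before any further compression.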

How we built it

Face Landmark Detection

We use a pre-trained TensorFlow.js model to detect the face landmarks in the browser.

Peer to Peer Communication

The peer-to-peer communication is done with PeerJS, with connections brokered by PeerServer Cloud. The audio is sent as is, but no video is sent, just the tensor of face landmarks.
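As a rough sketch of the data side of this, the snippet below sends each set of landmarks over a data connection and keeps only the newest one on the receiving end. `LandmarkChannel`, `LandmarkMessage`, and both classes are hypothetical names. The sequence number is an addition for illustration, not something the project is known to use. PeerJS's `DataConnection` has a `send` method, so it should fit the `LandmarkChannel` shape.

```typescript
// Hypothetical minimal view of a data connection (anything with send()).
interface LandmarkChannel {
  send(data: unknown): void;
}

// Hypothetical message envelope: a sequence number plus the flat landmark array.
interface LandmarkMessage {
  seq: number;
  points: number[];
}

// Sends each new set of landmarks with an increasing sequence number.
class LandmarkSender {
  private seq = 0;
  private readonly channel: LandmarkChannel;

  constructor(channel: LandmarkChannel) {
    this.channel = channel;
  }

  sendFrame(points: number[]): void {
    const message: LandmarkMessage = { seq: this.seq++, points };
    this.channel.send(message);
  }
}

// Keeps only the newest landmarks, ignoring anything that arrives late.
class LandmarkReceiver {
  private lastSeq = -1;
  latest: number[] | null = null;

  // Returns true when the message was accepted as the newest frame.
  handle(message: LandmarkMessage): boolean {
    if (message.seq <= this.lastSeq) return false;
    this.lastSeq = message.seq;
    this.latest = message.points;
    return true;
  }
}
```

With PeerJS, the sender would wrap the connection returned by `peer.connect(...)`, and the receiver's `handle` would be called from the connection's `data` event.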

Rendering the Avatar

We use WebGL to render the avatar. We bind the chosen avatar image as a texture to the face geometry.
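One way to bind an image to a face geometry is to give every vertex a texture coordinate taken from where the matching landmark sits in the avatar image. The sketch below computes those coordinates. It assumes the landmarks were detected on the avatar image in pixel units, and `landmarkUVs` and `clamp01` are hypothetical helper names, so treat it as an illustration of the idea rather than the project's renderer.

```typescript
interface Point2D {
  x: number;
  y: number;
}

// Clamp a value into the [0, 1] range used by texture coordinates.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

// Convert landmark positions on the avatar image (in pixels) into texture
// coordinates, one (u, v) pair per vertex of the face geometry.
function landmarkUVs(
  avatarLandmarks: Point2D[],
  imageWidth: number,
  imageHeight: number
): Float32Array {
  const uvs = new Float32Array(avatarLandmarks.length * 2);
  avatarLandmarks.forEach((p, i) => {
    uvs[i * 2] = clamp01(p.x / imageWidth);
    // Whether v needs flipping depends on how the image is uploaded to the GPU.
    uvs[i * 2 + 1] = clamp01(p.y / imageHeight);
  });
  return uvs;
}
```

The resulting array would be uploaded as a vertex attribute, while the vertex positions are updated every frame from the live landmarks.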

Challenges we ran into

Live predictions in the browser can be computationally intensive. To address that, we tuned some of the model parameters for performance and also optimized how we call the model.
Rendering a face geometry in the browser and binding a texture to it was not straightforward. But we did not give up, and we achieved what had seemed almost impossible during some late-night hours.
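On the first challenge, one common way to reduce the cost of calling a model on every frame is to cap the detection rate and never start a new prediction while one is still running. The sketch below shows that pattern. `DetectionThrottle` and its parameters are hypothetical, and the text above does not say which optimizations Panim actually uses.

```typescript
// Decides when a new landmark detection may start.
class DetectionThrottle {
  private lastStartMs = -Infinity;
  private busy = false;
  private readonly minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Call once per rendered frame; returns true when a detection should start now.
  tryStart(nowMs: number): boolean {
    if (this.busy || nowMs - this.lastStartMs < this.minIntervalMs) {
      return false;
    }
    this.busy = true;
    this.lastStartMs = nowMs;
    return true;
  }

  // Call when the running detection has resolved (or failed).
  finish(): void {
    this.busy = false;
  }
}
```

In a render loop, `tryStart` would be checked on each animation frame and `finish` called once the model's promise settles, so rendering can keep using the most recent landmarks in between.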

Accomplishments that we're proud of

The website is live and operating for anyone to use, and it scales well in the number of simultaneous calls because of its peer-to-peer nature.
Rendering the face and binding the texture to it is some advanced-level WebGL right there.
Generating avatars online from a snapshot seemed like a far-fetched goal for the hackathon, but we made it happen.

What we learned

Everything is possible. This is not the first time we have learned this, and it will not be the last.

What's next for Panim

New Features:

Hybrid mode: a frame is sent to the peer once every interval (e.g. every second), so the background of the face is updated every interval while the face itself moves continuously through the landmarks. This gives an experience closer to a video call, but still uses a lot less bandwidth.
Upload Avatar: upload your avatar from a saved image on your computer.
Moving background: we want to allow some more of the background in the avatar image to move around with the face, so that the faces also have hair and ears :)
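
Since hybrid mode is still only planned, the following is just a sketch of how the sender might decide what to transmit on each tick: landmarks every time, plus a full frame once per interval. `HybridScheduler`, `PayloadKind`, and the interval value are hypothetical.

```typescript
// What the sender should transmit on a given tick.
type PayloadKind = "landmarks" | "landmarks+frame";

class HybridScheduler {
  private lastFrameMs = -Infinity;
  private readonly frameIntervalMs: number;

  constructor(frameIntervalMs: number) {
    this.frameIntervalMs = frameIntervalMs;
  }

  // Landmarks go out on every tick; a full frame is added once per interval.
  next(nowMs: number): PayloadKind {
    if (nowMs - this.lastFrameMs >= this.frameIntervalMs) {
      this.lastFrameMs = nowMs;
      return "landmarks+frame";
    }
    return "landmarks";
  }
}
```

With a one-second interval, the first tick carries a frame and the following ticks carry only landmarks until a second has passed.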