HyperCall

View when first opening the site.
Click and Drag to control camera.
Side view generated by depth mapping.
MediaPipe API used for real time inferencing.
WebGL rendering done with Three.js.
WebRTC used for peer to peer video calling functionality.
Link to join call.
Enjoy!

Inspiration

Over the past couple of years, we've all spent considerable portions of our lives sitting in Zoom calls. Zoom calls were pretty reliable and seemingly solved the video conferencing problem, but we felt the experience was always a bit... flat.

What it does

HyperCall is a 3D video calling solution. It is a web application that lets you call people in 3D space and share your camera with them. Start a call and have your friend join using the 4-letter access code. Click and drag to see your friend in all 3 glorious dimensions.
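To make the access code idea concrete, here is a minimal sketch of how a room server might hand out 4-letter codes. The function name `makeAccessCode`, the uppercase A-Z alphabet, and the retry limit are all assumptions for illustration; the actual HyperCall server code is not shown here and may work differently.

```typescript
// Hypothetical helper, not taken from the HyperCall source.
// Assumes codes are four uppercase letters (26^4 = 456,976 possibilities).
const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const CODE_LENGTH = 4;

function makeAccessCode(
  taken: Set<string>,
  random: () => number = Math.random, // injectable so the function can be tested
): string {
  // Retry on collision; the cap keeps the loop from spinning forever
  // if nearly every code is already in use.
  for (let attempt = 0; attempt < 1000; attempt++) {
    let code = "";
    for (let i = 0; i < CODE_LENGTH; i++) {
      code += ALPHABET.charAt(Math.floor(random() * ALPHABET.length));
    }
    if (!taken.has(code)) {
      taken.add(code); // reserve the code for the new room
      return code;
    }
  }
  throw new Error("no free access code found");
}
```

A server could keep `taken` in memory and remove a code when its call ends, so codes can be reused.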

How we built it

We used native browser APIs to access the user's webcam. To make the faces 3D, we created two machine learning models in TensorFlow. First, we made an image segmentation model that separated the user's face from the background. Then, we made another model that estimated how far each point on the face was from the camera. We combined the results of these two models to create a face mesh that could be rendered using Three.js. To get video calling functionality, we implemented WebRTC and hosted a room management server on Google Cloud Run. We used a GitHub repository to host our code and CircleCI for CI/CD.
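The step of combining the two model outputs can be sketched roughly as follows. This is an illustration under stated assumptions, not our exact pipeline: it assumes the segmentation result is a per-pixel mask (1 for face, 0 for background), the depth result is a per-pixel value in [0, 1], and the names `buildFacePositions` and `depthScale` are hypothetical.

```typescript
// Hypothetical sketch: turn a face mask plus a depth map into a flat
// [x, y, z, x, y, z, ...] array of vertex positions.
function buildFacePositions(
  mask: Uint8Array,    // assumed layout: 1 = face pixel, 0 = background
  depth: Float32Array, // assumed: estimated distance per pixel, in [0, 1]
  width: number,
  height: number,
  depthScale = 0.5,    // hypothetical knob for how pronounced the 3D effect is
): Float32Array {
  const pixelCount = width * height;
  if (mask.length !== pixelCount || depth.length !== pixelCount) {
    throw new Error("mask and depth must both have width * height entries");
  }
  const positions: number[] = [];
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = y * width + x;
      if (mask[i] !== 1) continue; // skip background pixels
      // Map pixel coordinates to [-1, 1], flipping y so that up is positive.
      const nx = width > 1 ? (x / (width - 1)) * 2 - 1 : 0;
      const ny = height > 1 ? 1 - (y / (height - 1)) * 2 : 0;
      // Push farther points back along negative z.
      const nz = -(depth[i] ?? 0) * depthScale;
      positions.push(nx, ny, nz);
    }
  }
  return new Float32Array(positions);
}
```

In Three.js, an array like this can be wrapped in a `THREE.BufferAttribute` with an item size of 3 and set as the `position` attribute of a `THREE.BufferGeometry`, which can then be drawn as points or triangulated into a mesh.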

Challenges we ran into

The depth detection model isn't perfectly tuned. Occasionally, the algorithm turns the face mesh into some sort of Lovecraftian horror. Our biggest challenge was deploying the app. We needed to Dockerize our code in order to deploy it using CircleCI. The default Node Docker image lacked some of the features we needed, so we had to find workarounds. Then, once we were able to deploy our server to Google Cloud Run, we ran into CORS errors in our client apps. To resolve this, we needed our frontend and backend running on the same domain. The issue is that DNS changes take forever to propagate, so we weren't sure whether we had done something wrong or just had to wait a bit longer for everything to start working. We went to sleep with a broken app, but by the time we woke up everything had propagated and HyperCall was functional!

Accomplishments that we're proud of

We created a functional app that implemented WebRTC and machine learning models. We deployed this app using scalable technologies on Google Cloud Run and implemented good development practices using CI/CD from CircleCI. We also got a snazzy .tech domain from Domain.com, which allowed us to encrypt our video calls and protect our users' data.

What we learned

We learned how to use WebRTC for unintended use cases. We also learned how to run machine learning models in real time in the browser. We enjoyed experimenting with WebGL through the Three.js library. This was also our first time using CircleCI, so we became more familiar with that technology, and it really helped us keep our code quality consistent. Finally, we learned how to deploy our containerized application onto Google Cloud Run so we could handle thousands of calls simultaneously.

What's next for HyperCall

First, we want to tune our machine learning models to improve the quality of the 3D effect and potentially avoid some of the scarier results. It would also be cool if a user could combine multiple camera feeds at different angles to better reconstruct their face in 3D. We also want to integrate HyperCall with VR headsets like the Oculus Quest so we can provide an even more immersive experience. Finally, we want to update our WebRTC server and clients to support video calls with more than two users. We currently use peer-to-peer connections, which would scale poorly as more users joined a call, so we'd need to create a more powerful server that can act as a hub for all the clients.
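A quick back-of-the-envelope sketch shows why peer-to-peer scales poorly. It assumes a full mesh, where every participant connects directly to every other participant, and compares that with a hub design where each participant keeps a single connection to the server. The function names are hypothetical and the counts ignore bandwidth and encoding costs.

```typescript
// Full mesh: every pair of participants needs its own connection,
// so the total grows quadratically: n * (n - 1) / 2.
function meshConnections(participants: number): number {
  return (participants * (participants - 1)) / 2;
}

// In a full mesh, each client also uploads its video once per other participant.
function meshUploadsPerClient(participants: number): number {
  return Math.max(participants - 1, 0);
}

// Hub: each participant keeps one connection to the central server,
// so the total grows linearly.
function hubConnections(participants: number): number {
  return participants;
}

for (const n of [2, 4, 10]) {
  console.log(
    `${n} users: mesh=${meshConnections(n)}, ` +
      `uploads per client=${meshUploadsPerClient(n)}, hub=${hubConnections(n)}`,
  );
}
```

With two users the mesh needs a single connection, which is why it works well today. With ten users it would need 45 connections and nine uploads per client, while a hub would need ten connections and one upload per client.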