Inspiration

The inspiration was ElevenLabs voice generation. Someday all 3D models will have AI-generated voices that are automatically lip-synced.

What it does

Currently, a simple 3D model with a baked-in animation starts animating when the user clicks Speak.

How we built it

I built it with the browser's built-in speech API, BabylonJS, and Blender.
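
As a rough illustration of how these pieces can fit together, here is a minimal sketch that starts a looping animation when speech starts and stops it when speech ends. It is not the project's actual code: `makeSpeechHandlers` and `AnimationLike` are hypothetical names, and the sketch only assumes the animation object exposes `start(loop)` and `stop()`, as a BabylonJS `AnimationGroup` does.

```typescript
// Hypothetical minimal shape of the animation object. A BabylonJS
// AnimationGroup has start(loop) and stop() methods and would fit here.
interface AnimationLike {
  start(loop?: boolean): unknown;
  stop(): unknown;
}

// Hypothetical helper: builds the two callbacks that tie speech to animation.
function makeSpeechHandlers(anim: AnimationLike) {
  return {
    // Loop the baked-in animation for as long as the voice is speaking.
    onStart: (): void => {
      anim.start(true);
    },
    // Return the model to rest once speech is over.
    onEnd: (): void => {
      anim.stop();
    },
  };
}

// In the browser, the wiring might look like this (assumed, untested here):
//   const utterance = new SpeechSynthesisUtterance("Hello!");
//   const handlers = makeSpeechHandlers(animationGroup);
//   utterance.onstart = handlers.onStart;
//   utterance.onend = handlers.onEnd;
//   speechSynthesis.speak(utterance);
```

Keeping the handlers separate from the browser objects makes the logic easy to try out without a speech engine or a 3D scene.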

Challenges we ran into

The speech API does not seem to fire reliably when speaking ends.
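
One possible workaround, sketched below, is to pair the end event with a fallback timer so the animation stops even if the event never arrives. This is an assumption about how the problem could be handled, not something the project does: `makeEndGuard` and `estimateDurationMs` are hypothetical helpers, and the figure of 15 characters per second is a rough guess rather than a measured value.

```typescript
type Scheduler = (fn: () => void, ms: number) => void;

// Hypothetical estimate of how long the text takes to speak.
// ASSUMPTION: about 15 characters per second at rate 1, plus 500 ms of slack.
function estimateDurationMs(text: string, rate: number = 1): number {
  const charsPerSecond = 15 * rate;
  return Math.ceil((text.length / charsPerSecond) * 1000) + 500;
}

// Hypothetical helper: returns a callback to attach as the utterance's end
// handler. onFinished runs exactly once, either when that callback is invoked
// or when the fallback timer expires, whichever happens first.
function makeEndGuard(
  text: string,
  rate: number,
  onFinished: () => void,
  schedule: Scheduler = (fn, ms) => { setTimeout(fn, ms); },
): () => void {
  let done = false;
  const finish = (): void => {
    if (done) return; // ignore the second trigger
    done = true;
    onFinished();
  };
  // The timer starts now, so create the guard right before speaking.
  schedule(finish, estimateDurationMs(text, rate));
  return finish;
}

// In the browser, the wiring might look like this (assumed, untested here):
//   const utterance = new SpeechSynthesisUtterance(text);
//   const guard = makeEndGuard(text, utterance.rate, stopAnimation);
//   utterance.onend = guard;
//   utterance.onerror = guard;
//   speechSynthesis.speak(utterance);
```

The trade-off is that a poor duration estimate can stop the animation slightly early or late, so the timer is best treated as a safety net rather than the primary signal.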

Accomplishments that we're proud of

I'm happy that I was even able to get an animated 3D model into the browser.

What we learned

Mostly that image-to-3D libraries like Shap-E do not work well yet.

What's next for Talking Head

Splitting the text by phoneme and accurately syncing the mouth animation to the text.
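
To give a sense of what this step might involve, here is a deliberately crude sketch that maps letters to mouth shapes and spreads them evenly over a duration. It is only a stand-in: real phoneme splitting needs a pronunciation dictionary or a speech engine, and every name here (`Viseme`, `letterToViseme`, `textToVisemes`, `spreadEvenly`) and every letter grouping is an assumption made for illustration.

```typescript
// Hypothetical set of mouth shapes the model could blend between.
type Viseme = "closed" | "open" | "wide" | "round" | "narrow" | "rest";

// ASSUMPTION: a letter-based guess, not a real phoneme lookup.
function letterToViseme(ch: string): Viseme {
  if (!/^[a-z]$/.test(ch)) return "rest"; // spaces, digits, punctuation
  if ("bmp".indexOf(ch) !== -1) return "closed"; // lips pressed together
  if (ch === "a") return "open";
  if ("ei".indexOf(ch) !== -1) return "wide";
  if ("ouw".indexOf(ch) !== -1) return "round";
  return "narrow"; // all remaining letters
}

// Convert text to a viseme sequence, merging consecutive repeats.
function textToVisemes(text: string): Viseme[] {
  const result: Viseme[] = [];
  for (const ch of text.toLowerCase().split("")) {
    const viseme = letterToViseme(ch);
    if (result[result.length - 1] !== viseme) result.push(viseme);
  }
  return result;
}

interface VisemeCue {
  viseme: Viseme;
  startMs: number;
}

// Assign each viseme an evenly spaced start time across totalMs.
function spreadEvenly(visemes: Viseme[], totalMs: number): VisemeCue[] {
  const step = visemes.length > 0 ? totalMs / visemes.length : 0;
  return visemes.map((viseme, i) => ({ viseme, startMs: Math.round(i * step) }));
}
```

For example, `textToVisemes("map")` yields closed, open, closed, and `spreadEvenly` over 300 ms places those cues at 0, 100, and 200 ms. Even spacing ignores how long each sound really lasts, which is exactly the gap that accurate phoneme timing would close.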

Built With

  • babylonjs
  • chatgpt