Inspiration
The inspiration was ElevenLabs voice generation. Someday all 3D models will have AI-generated voices that are automatically lip-synced.
What it does
Currently, a simple 3D model with a baked-in animation starts animating when the user clicks Speak.
How we built it
I built it with the browser's built-in speech API, BabylonJS, and Blender.
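As a rough illustration of how these pieces could fit together, here is a minimal sketch, not the project's actual code. The names `createTalkingHead`, `AnimationLike`, and `Speak` are hypothetical; the animation and the speech call are passed in so the logic can be tried outside a browser. It assumes the baked-in clip is exposed as something with `start` and `stop`, which is how a BabylonJS `AnimationGroup` behaves.

```typescript
// Hypothetical minimal shape of the baked-in animation. A BabylonJS
// AnimationGroup offers start(loop) and stop(), so it should fit here.
interface AnimationLike {
  start(loop?: boolean): unknown;
  stop(): unknown;
}

// Hypothetical speech function: speaks the text, then calls onEnd.
type Speak = (text: string, onEnd: () => void) => void;

function createTalkingHead(mouth: AnimationLike, speak: Speak) {
  let speaking = false;
  return {
    isSpeaking: (): boolean => speaking,
    say(text: string): void {
      // Ignore empty input and clicks while a sentence is still playing.
      if (speaking || text.trim() === "") return;
      speaking = true;
      mouth.start(true); // loop the clip while the voice is playing
      speak(text, () => {
        speaking = false;
        mouth.stop();
      });
    },
  };
}

// Possible browser wiring (assumes a loaded scene, a button, and an input):
// const speak: Speak = (text, onEnd) => {
//   const utterance = new SpeechSynthesisUtterance(text);
//   utterance.onend = onEnd;
//   speechSynthesis.speak(utterance);
// };
// const head = createTalkingHead(scene.animationGroups[0], speak);
// button.onclick = () => head.say(input.value);
```

Keeping the speech call behind a plain function makes it easy to swap the browser voice for another voice service later.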
Challenges we ran into
The speech API does not seem to reliably fire an event when speaking ends.
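One possible workaround, which I have not confirmed fixes this, is a fallback timer: stop the animation when the end event fires or when an estimated speaking time has passed, whichever comes first. The sketch below is only that idea in code. `estimateSpeechMs` and `onceWithFallback` are hypothetical helper names, and the 160 words-per-minute default is a guess rather than a measured value.

```typescript
// Rough duration estimate for a spoken string, in milliseconds.
// wordsPerMinute = 160 is an assumed speaking pace, not a measurement.
function estimateSpeechMs(text: string, rate = 1, wordsPerMinute = 160): number {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0).length;
  return Math.ceil((words * 60_000) / (wordsPerMinute * rate));
}

// Returns a function that runs onDone exactly once: either when it is
// called directly (for example from an end event) or when the fallback
// timer expires, whichever happens first.
function onceWithFallback(onDone: () => void, fallbackMs: number): () => void {
  let done = false;
  let timer: ReturnType<typeof setTimeout> | undefined;
  const finish = (): void => {
    if (done) return;
    done = true;
    if (timer !== undefined) clearTimeout(timer);
    onDone();
  };
  timer = setTimeout(finish, fallbackMs);
  return finish;
}

// Possible browser usage (stopAnimation is a hypothetical callback):
// const utterance = new SpeechSynthesisUtterance(text);
// const finish = onceWithFallback(stopAnimation, estimateSpeechMs(text) + 500);
// utterance.onend = finish;
// utterance.onerror = finish;
// speechSynthesis.speak(utterance);
```

The extra 500 ms in the usage comment is an arbitrary safety margin so the mouth does not stop before the voice does.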
Accomplishments that we're proud of
I'm happy that I was even able to get an animated 3D model into the browser.
What we learned
Mostly that image-to-3D libraries such as Shap-E do not work well yet.
What's next for Talking Head
Splitting the text by phoneme and accurately syncing the mouth animation to the text.
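To make that next step concrete, here is a very rough sketch of the idea. It is an assumption-heavy stand-in: it maps letters, not real phonemes, to a handful of mouth shapes (visemes) and spreads them evenly over an estimated duration. The letter groups, the `Viseme` names, and both function names are hypothetical; proper phoneme splitting would need a pronunciation dictionary or a speech service that reports timings.

```typescript
type Viseme = "closed" | "open" | "round" | "teeth" | "neutral";

// Hypothetical letter groups. A real system would map phonemes, not letters.
const LETTER_TO_VISEME: Record<string, Viseme> = {
  m: "closed", b: "closed", p: "closed",
  a: "open", e: "open", i: "open",
  o: "round", u: "round", w: "round",
  f: "teeth", v: "teeth",
};

// Turn text into one mouth shape per letter, skipping spaces and punctuation.
function textToVisemes(text: string): Viseme[] {
  return Array.from(text.toLowerCase())
    .filter((ch) => /[a-z]/.test(ch))
    .map((ch) => LETTER_TO_VISEME[ch] ?? "neutral");
}

interface VisemeCue {
  viseme: Viseme;
  startMs: number;
}

// Spread the mouth shapes evenly across the total speaking time.
function scheduleVisemes(visemes: Viseme[], totalMs: number): VisemeCue[] {
  const step = visemes.length > 0 ? totalMs / visemes.length : 0;
  return visemes.map((viseme, i) => ({ viseme, startMs: Math.round(i * step) }));
}

// Example: cues for "map" spoken over 300 ms.
const cues = scheduleVisemes(textToVisemes("map"), 300);
// cues[0] is { viseme: "closed", startMs: 0 }
```

Even spacing will drift from the real audio, so each cue's start time would eventually need to come from the speech engine itself rather than from an estimate.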
Built With
- babylonjs
- chatgpt