V.I.N.E (Voice & Image Narrative Expert)

logo
interface

Inspiration

We were inspired by the potential to help parents provide educational content and narrate bedtime stories for their children.

What it does

V.I.N.E is a generative model that takes the user's desired story genre, then creates and narrates the story in an interactive format through images and a user-chosen narrator.

How we built it

We used Microsoft's Azure Speech Recognition to capture the user's input. A large language model generates the story content, and we then use Stable Diffusion AI's API and the ElevenLabs API to generate relevant images and narration in the user-chosen voice from that content.
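
The data flow described above can be sketched roughly as follows. This is only an illustration, not our actual code: `Page`, `build_story`, and the three `fake_*` functions are hypothetical names, and the stand-ins return placeholder values instead of calling Azure, the language model, Stable Diffusion, or ElevenLabs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    """One screen of the story: its text, an illustration, and narration audio."""
    text: str
    image: bytes
    audio: bytes


def build_story(
    genre: str,
    voice: str,
    write_story: Callable[[str], List[str]],
    draw_image: Callable[[str], bytes],
    narrate: Callable[[str, str], bytes],
) -> List[Page]:
    """Turn a requested genre into a list of illustrated, narrated pages."""
    pages = []
    for passage in write_story(genre):
        # Both the image and the narration are derived from the same passage,
        # so they stay tied to the text the language model produced.
        pages.append(Page(passage, draw_image(passage), narrate(passage, voice)))
    return pages


# Hypothetical stand-ins for the real services (assumptions, not real APIs).
def fake_write_story(genre: str) -> List[str]:
    return [f"Once upon a time, in a {genre} land...", "The end."]


def fake_draw_image(passage: str) -> bytes:
    return f"<image:{passage}>".encode()


def fake_narrate(passage: str, voice: str) -> bytes:
    return f"<audio:{voice}:{passage}>".encode()


if __name__ == "__main__":
    # In the real app, the genre would come from the speech recognition step.
    for page in build_story("fairy tale", "grandma",
                            fake_write_story, fake_draw_image, fake_narrate):
        print(page.text)
```

Passing the services in as functions keeps the flow easy to test without any network access or API keys.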

Challenges we ran into

We faced implementation issues because the multiple API calls were not synchronized with one another. We also struggled to find good open-source repositories that could help us achieve our goals.
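
One common way to keep several calls in step is to run them concurrently and wait for all of them before showing a page. The sketch below uses Python's `asyncio.gather` for this. It is one possible approach under our own assumptions, not what the project shipped: `fake_image_call`, `fake_voice_call`, `prepare_page`, and `prepare_story` are hypothetical names, and the sleeps merely imitate network latency.

```python
import asyncio
from typing import List, Tuple


async def fake_image_call(text: str) -> str:
    # Stand-in for an image-generation request (assumed to be the slower one).
    await asyncio.sleep(0.02)
    return f"image for: {text}"


async def fake_voice_call(text: str) -> str:
    # Stand-in for a text-to-speech request.
    await asyncio.sleep(0.01)
    return f"audio for: {text}"


async def prepare_page(text: str) -> Tuple[str, str, str]:
    # gather waits for both calls, so a page is never returned half-ready.
    image, audio = await asyncio.gather(fake_image_call(text), fake_voice_call(text))
    return text, image, audio


async def prepare_story(passages: List[str]) -> List[Tuple[str, str, str]]:
    # gather returns results in the order of its arguments, so pages stay in
    # story order even if a later page finishes first.
    return await asyncio.gather(*(prepare_page(p) for p in passages))


if __name__ == "__main__":
    for page in asyncio.run(prepare_story(["Once upon a time...", "The end."])):
        print(page)
```

Because results come back in argument order, the caller does not need extra bookkeeping to match each image and audio clip to its passage.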