Inspiration
We were inspired by the potential to help parents provide educational content and narrate bedtime stories for their children.
What it does
V.I.N.E is a generative model that takes the user's desired story genre, then creates and narrates the story in an interactive format with images and a user-chosen narrator.
How we built it
We used Microsoft's Azure Speech Recognition to capture the user's input. A large language model generates the story content, and we then use Stable Diffusion's API and the ElevenLabs API to generate relevant images and narration in the user-chosen voice from that content.
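As a rough illustration of that data flow (not our actual code), the pipeline can be sketched as below. The names `StoryPage`, `build_story`, `write_story`, `draw_image`, and `speak` are hypothetical placeholders; the three callables stand in for the language model, the image API, and the voice API, and no real service signatures are assumed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StoryPage:
    """One page of the story: its text plus the generated media."""
    text: str
    image: bytes
    audio: bytes


def build_story(
    genre: str,
    voice: str,
    write_story: Callable[[str], List[str]],   # hypothetical: genre -> paragraphs (LLM)
    draw_image: Callable[[str], bytes],        # hypothetical: paragraph -> image bytes
    speak: Callable[[str, str], bytes],        # hypothetical: (paragraph, voice) -> audio bytes
) -> List[StoryPage]:
    """Generate the story text, then an image and narration for each paragraph."""
    pages = []
    for paragraph in write_story(genre):
        pages.append(
            StoryPage(
                text=paragraph,
                image=draw_image(paragraph),
                audio=speak(paragraph, voice),
            )
        )
    return pages
```

Passing the services in as plain functions keeps the sketch independent of any particular SDK, so each one can be swapped for a stub while testing.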
Challenges we ran into
We faced implementation issues because the multiple API calls were not synchronized with one another. We also struggled to find good open-source repositories that could help us achieve our goals.
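One possible way to handle the synchronization problem, shown here only as a sketch and not as what we shipped, is to start the image and audio requests for a page together and wait for both before showing the page. It assumes the services can be wrapped in coroutines; `render_page`, `render_story`, `draw_image`, and `speak` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def render_page(
    text: str,
    voice: str,
    draw_image: Callable[[str], Awaitable[bytes]],   # hypothetical async image call
    speak: Callable[[str, str], Awaitable[bytes]],   # hypothetical async voice call
) -> Dict[str, object]:
    """Run both media requests concurrently and return only when both finish."""
    image, audio = await asyncio.gather(draw_image(text), speak(text, voice))
    return {"text": text, "image": image, "audio": audio}


async def render_story(
    paragraphs: List[str],
    voice: str,
    draw_image: Callable[[str], Awaitable[bytes]],
    speak: Callable[[str, str], Awaitable[bytes]],
) -> List[Dict[str, object]]:
    """Render every page; asyncio.gather returns results in input order."""
    tasks = [render_page(p, voice, draw_image, speak) for p in paragraphs]
    return await asyncio.gather(*tasks)
```

Because `asyncio.gather` returns results in the order the awaitables were passed, pages stay in story order even when a later page's media finishes first.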
Accomplishments that we're proud of
We're proud to have implemented a product that is able to solve a genuine problem that is relevant to people in our society.
What we learned
Our biggest takeaway was that great things are possible when we're motivated by a desire to create something new and exciting.
What's next for V.I.N.E (Voice & Image Narrative Expert)
We plan to support multiple characters, make the user interface more interactive with subtitles, and provide a more lifelike experience.