Children spend the entire day living in a fantasy world they create for their toys. What if we could bring that world to life with a bedtime story as they drift off to sleep?
Research has suggested that children who are allowed to engage their imagination more freely show higher levels of creativity, patience, intelligence, empathy and EQ in adulthood.
What it does
Users upload photos of their toy. We train a Dreambooth model on them, generate an age-appropriate story using GPT-3/Cohere, and generate illustrations to go with it using Stable Diffusion. Parents can read the story to their child, or let the child's favourite narrator spark their imagination with UberDuck text-to-speech.
Children can also narrate their own story using AssemblyAI's speech-to-text model, and we'll tighten it up, add a plot twist and finish the story off for them.
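The "tighten it up, add a plot twist, finish it off" flow is a small prompt chain: each step's output becomes the next step's input. Here is a minimal sketch, assuming the transcript has already come back from speech-to-text. The prompt wording, the `chain_story` name and the `complete` callable (a stand-in for whichever text completion API is used) are all hypothetical, not our production code.

```python
from typing import Callable

# Hypothetical prompt templates; the wording is illustrative only.
TIGHTEN = (
    "Rewrite this child's story in clear, simple sentences, "
    "keeping every character and event. Return the full story.\n\n{text}"
)
TWIST = (
    "Add one gentle, surprising plot twist to this story. "
    "Return the full story.\n\n{text}"
)
FINISH = (
    "Add a calm, happy ending suitable for bedtime to this story. "
    "Return the full story.\n\n{text}"
)


def chain_story(transcript: str, complete: Callable[[str], str]) -> str:
    """Pass the transcript through three completion steps in order.

    `complete` is an assumed wrapper that takes a prompt string and
    returns the model's text; each step's output feeds the next step.
    """
    text = transcript.strip()
    for template in (TIGHTEN, TWIST, FINISH):
        text = complete(template.format(text=text)).strip()
    return text
```

Keeping the model call behind a plain callable means the same chain can be pointed at a different provider, or at a fake function in tests, without changing the steps.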
Bring your toys to life with bedtime stories!
Challenges we ran into
We need some more time to refine our prompts. They are decent, but we know they can be much better with a little more experimentation and fine-tuning.
Cohere's output for the story was unreliable (compared to OpenAI). Sometimes it would output half a story, two stories or just one line. This was really tough to debug given the time constraints. With more time, we could train our own Cohere model on outputs from GPT-3, which might even outperform standard GPT-3 output.
Rate limiting from the APIs we were using slowed down some development work. Waiting 20 minutes for models to train limited what we could show in our video demo.
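One way to cope with half stories, double stories and one-liners is to check each completion before accepting it and ask again if it fails. The sketch below is an assumption about how that could look, not what we shipped: the heuristics, the 80-word threshold, and the names `looks_like_one_story`, `generate_story` and `complete` are all hypothetical.

```python
import re
from typing import Callable, Optional

# Openers that usually mark the start of a story; more than one suggests
# the model returned two stories back to back. (Illustrative heuristic.)
STORY_START = re.compile(r"(?im)^[ \t]*(?:title:|once upon a time)")


def looks_like_one_story(text: str, min_words: int = 80) -> bool:
    """Cheap sanity checks for the three failure modes we saw."""
    body = text.strip()
    if len(body.split()) < min_words:
        return False  # a single line or a fragment
    if not body.endswith((".", "!", "?", '"', "'")):
        return False  # probably cut off mid-sentence
    if len(STORY_START.findall(body)) > 1:
        return False  # probably two stories
    return True


def generate_story(
    prompt: str,
    complete: Callable[[str], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Call the model until one output passes the checks, or give up."""
    for _ in range(max_attempts):
        candidate = complete(prompt)
        if looks_like_one_story(candidate):
            return candidate.strip()
    return None
```

Returning `None` after the last attempt lets the caller decide whether to fall back to another provider or show an error, rather than displaying a broken story.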
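A common way to soften rate limits is to retry with exponential backoff and a little jitter. This is a generic sketch rather than anything specific to the APIs we used: `RateLimited` is a hypothetical exception that a client wrapper would raise on an HTTP 429 response, and the delay values are placeholders.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by our own wrapper on a 429 response."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so parallel requests do not
            # all hit the API again at the same moment.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the function can be tested without actually waiting.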
Accomplishments that we're proud of
A pretty solid MVP despite limited exposure to most of the tools we used! With some small improvements this could actually make a great SaaS business.
What we learned
Training models on Replicate, labelling inputs, prompt engineering, prompt chaining and providing context to text completion models. We also got exposure to the general AI landscape (we hadn't heard of UberDuck, AssemblyAI or Cohere before).
What's next for Toy Story Creator
We'll spend a few more hours cleaning up the UI (things got very messy towards the end as we rushed to finish in time) and refining our prompts to produce better-quality and more reliable outputs.
We'll finish training our Cohere text completion model (it says 2 hours remaining at the time of writing) so that we can generate much better stories, and implement UberDuck to read the stories too!
If we decide to continue with this project, our roadmap could include:
- Allow children to dictate their own story (using AssemblyAI's speech-to-text model), and use it as a prompt to generate a story along with illustrations
- Why stop at stories? Use RunwayML to generate animated movies, videos and cartoons
- Allow parents to include certain lessons in stories. For example, if a child is being bullied at school, the stories generated for them can be about people who stood up to bullies, making the child feel less isolated and giving them techniques to deal with the real world
- Create and share stories with your friends... Maybe even earn from others reading your stories
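Age-appropriate stories and parent-chosen lessons both come down to how the prompt is assembled. The sketch below shows one possible shape for that; the `build_story_prompt` name, the age bands, the word counts and the wording are all assumptions made for illustration, not the prompts we actually used.

```python
from typing import Optional


def build_story_prompt(toy_name: str, age: int, lesson: Optional[str] = None) -> str:
    """Assemble a story prompt from the toy, the child's age and an optional lesson."""
    # Hypothetical age bands; real thresholds would need testing with parents.
    if age <= 4:
        style = "very short sentences and about 150 words"
    elif age <= 7:
        style = "simple sentences and about 300 words"
    else:
        style = "richer vocabulary and about 500 words"

    lines = [
        f"Write one bedtime story for a {age}-year-old child.",
        f"The hero is their toy, {toy_name}.",
        f"Use {style}.",
    ]
    if lesson:
        # The lesson comes from the parent, e.g. "standing up to bullies".
        lines.append(f"Gently weave in this lesson without lecturing: {lesson}.")
    lines.append("End the story calmly so the child is ready to sleep.")
    return "\n".join(lines)
```

For example, `build_story_prompt("Mr Bear", 6, "standing up to bullies")` produces a five-line prompt that can be sent to any text completion model.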