Inspiration
As children, we loved picture books and making up our own stories. Now, with AI, it's easier than ever to turn those stories into a book!
What it does
- Helps children write their own stories
- Illustrates stories for children
- Collects a child's story into a picture book, shareable with their friends and family
- Uses the emotion in the child's voice to guide the story
How we built it
We used
- React for the UI (display and state management)
- hume.ai to facilitate the end-to-end conversation
- DALL-E to illustrate stories
- Firebase for saving stories
Challenges we ran into
1. Expiring Image URLs
The OpenAI DALL-E API responds with an image URL. We ran into two problems with this URL: latency and expiration. First, the image took up to five seconds to load. Second, the URL expired after a set number of hours, leaving inaccessible, broken images on our site.
To solve this challenge, we downloaded each image and re-uploaded it to Firebase Storage, replacing the stored image URL with the Firebase URL. CORS prevented us from doing this in our existing frontend, so we wrote a Node backend to perform this processing.
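A minimal sketch of what such a backend step could look like, assuming a server-side environment where CORS does not apply. The `ImageStore` interface, the `FetchLike` type, and the `pageImagePath` naming scheme are hypothetical and only illustrate the download-then-re-upload idea; they are not taken from the project's code or from the Firebase SDK.

```typescript
// Hypothetical storage interface. In a real backend, an implementation could
// wrap a Firebase Storage upload; the name and shape here are assumptions.
interface ImageStore {
  // Saves the bytes at `path` and resolves to a long-lived URL.
  save(path: string, bytes: Uint8Array, contentType: string): Promise<string>;
}

// Minimal shape of the fetch function this sketch relies on, so that a stub
// can be injected when testing without network access.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

// Hypothetical naming scheme for stored page illustrations.
function pageImagePath(storyId: string, pageIndex: number): string {
  return `stories/${storyId}/page-${pageIndex}.png`;
}

// Downloads the short-lived image URL on the server and re-uploads the bytes
// to durable storage, returning the URL that should replace the original.
async function persistImage(
  tempUrl: string,
  destPath: string,
  store: ImageStore,
  fetchImpl: FetchLike,
): Promise<string> {
  const res = await fetchImpl(tempUrl);
  if (!res.ok) {
    throw new Error(`Image download failed with status ${res.status}`);
  }
  // Fall back to PNG if the server does not report a content type.
  const contentType = res.headers.get("content-type") ?? "image/png";
  const bytes = new Uint8Array(await res.arrayBuffer());
  return store.save(destPath, bytes, contentType);
}
```

On Node 18 or later, the built-in `fetch` could be passed as `fetchImpl`, and `store.save` could wrap the actual storage upload. Doing the copy once, right after generation, also means readers load the image from storage rather than from the expiring URL.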
2. Sensitive Diffusion Model Prompts
Initially, we used the generated story text of each page directly as the prompt to DALL-E, the diffusion model we used for illustrations. The generated images were of very low quality and often did not match the prompt at all.
We suspect the reason is that diffusion models are trained quite differently from transformer language models. While those language models are pre-trained on the next-token prediction task with very long text sequences, diffusion models are trained to accept shorter prompts with more modifying attributes.
Therefore, we added a preprocessing step that extracts a five-word summary of the prompt and a list of five attributes. This step dramatically improved the quality of output illustrations.
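As a rough illustration of the final assembly step, the sketch below combines a short summary and a list of attributes into one compact image prompt and enforces the five-word and five-attribute limits. How the summary and attributes are extracted from the story text is not shown; `PromptParts` and `buildImagePrompt` are hypothetical names, and the comma-separated output format is an assumption.

```typescript
// Output of the extraction step (not shown here): a short summary of the page
// plus a list of modifying attributes. The shape is an assumption.
interface PromptParts {
  summary: string;
  attributes: string[];
}

const MAX_SUMMARY_WORDS = 5;
const MAX_ATTRIBUTES = 5;

// Builds a compact image prompt: at most five summary words followed by at
// most five non-empty attributes, all separated by commas.
function buildImagePrompt(parts: PromptParts): string {
  const summary = parts.summary
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0)
    .slice(0, MAX_SUMMARY_WORDS)
    .join(" ");
  const attributes = parts.attributes
    .map((attribute) => attribute.trim())
    .filter((attribute) => attribute.length > 0)
    .slice(0, MAX_ATTRIBUTES);
  return [summary, ...attributes].filter((part) => part.length > 0).join(", ");
}

// Example usage with made-up values:
const examplePrompt = buildImagePrompt({
  summary: "Fox finds a glowing lantern",
  attributes: ["watercolor", "soft light", "forest at night"],
});
// examplePrompt === "Fox finds a glowing lantern, watercolor, soft light, forest at night"
```

Enforcing the limits in code, rather than trusting the extraction step, keeps the prompt short even when the extracted summary or attribute list runs long.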
Accomplishments that we're proud of
- An end-to-end loop to create a story book by speaking the story aloud
- An aesthetic interface for viewing finished story books
What we learned
- AI prompts are very sensitive, especially prompts for diffusion models. Changing even a single word can drastically change the output!
What's next for StoryBook AI
To focus on the core functionality of the app, we omitted several things that we would want to build after the hackathon:
- User accounts. Users should be able to create accounts, possibly with linked parental accounts so that parents can view the stories their children made.
- Difficulty settings. Our goal is to improve the creativity and storytelling abilities of children. Younger children may need more assistance, while older children may focus on more complex literary elements such as plot and character development. We would like to tailor the questions the AI raises to each child's ability.
- Customization. Users should be able to customize the look and feel of their own stories, including themes and special styles.
Built With
- firebase
- hume.ai
- react
- typescript