Inspiration
Hi! We’re Jenny, Emma, Alice, and YJ, sophomores at Stanford University passionate about education and accessibility. We powered through this hackathon on Tejava teas, Matcha Pocky, and Spotify Blends.
What it does
The COVID-19 pandemic had widespread impacts on human interaction, especially for young children. Research indicates that the pandemic's restrictions, such as virtual schooling, have significantly affected children's ability to learn social cues from external stimuli. This disruption in a child’s vital developmental process highlights the urgent need for innovative solutions to support children's emotional and social learning.
We wanted to create an engaging education tool children would show a genuine interest in. Sentiment Storybook is envisioned to not only leverage technology to bridge the learning gap in a nurturing and accessible manner but also facilitate a deeper understanding among parents of their children’s needs while fostering parent-child quality time. We sought to teach children how to navigate situations that require different responses in a fun and engaging way.
Functionality
We introduce Sentiment Storybook, an interactive digital storybook that invites parents to embark on a journey of emotional learning alongside their children. Here's how it works: parents and children begin by selecting a core value or principle they wish to explore together — whether it be respect, compassion, honesty, or any other virtue pivotal to emotional and ethical growth.
Using a fine-tuned Mistral LLM and Stable Diffusion, Sentiment Storybook generates a unique picture book that illustrates a scenario showing how to practice this value. The scenarios feature colorful drawings that users can page through, simulating the experience of reading a physical picture book. The accompanying storyline provides context and depth to the illustrated scenarios. It's designed to prompt discussions between parents and children, encouraging them to reflect on the story, relate it to their own lives, and consider how they can apply the learned values in their daily interactions.
Sentiment Storybook also generates a new story on every read, so each session is a fresh learning adventure. Children get the opportunity to reflect on diverse scenarios, deepening their understanding.
How we built it
At a high level, we wanted to pass a value (such as “kindness”) into our web application and receive a sequence of 4-6 illustrations, accompanied by text, that demonstrate that value.
To generate the picture book, we first generated 4-6 story sentences from the value, along with an illustration description for each sentence in the story. This required building an API call to Monster API’s Llama2-7B-Chat model and several iterations of prompt engineering to get sentences and illustration descriptions in our desired format. After processing the response into a list of illustration descriptions, we passed them to Monster API’s sdxl-base model (through another API endpoint) to generate our illustrations.
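As a rough illustration of the response-processing step, here is a minimal sketch. The prompt wording, the "Sentence N:" / "Illustration N:" labels, and the names `StoryPage`, `buildStoryPrompt`, and `parseStoryResponse` are all assumptions made for this example, not our actual prompt or code.

```typescript
// Hypothetical page structure: one story sentence plus the description of its illustration.
interface StoryPage {
  sentence: string;
  illustration: string;
}

// Build the prompt sent to the chat model. The wording and the
// "Sentence N:" / "Illustration N:" labels are assumptions for this sketch.
function buildStoryPrompt(value: string, pages: number = 5): string {
  return [
    `Write a children's story in exactly ${pages} sentences that teaches the value "${value}".`,
    "After each sentence, describe the matching picture-book illustration.",
    "Use this format for every step:",
    "Sentence N: <one story sentence>",
    "Illustration N: <one-sentence description of the picture>",
  ].join("\n");
}

// Parse the model's raw text into pages, skipping lines that do not match the format.
function parseStoryResponse(raw: string): StoryPage[] {
  const sentences: Record<number, string> = {};
  const illustrations: Record<number, string> = {};
  for (const line of raw.split(/\r?\n/)) {
    const match = line.trim().match(/^(Sentence|Illustration)\s*(\d+)\s*:\s*(.+)$/i);
    if (!match) continue;
    const index = Number(match[2]);
    const target = match[1].toLowerCase() === "sentence" ? sentences : illustrations;
    target[index] = match[3].trim();
  }
  // Keep only steps that have both a sentence and an illustration description.
  return Object.keys(sentences)
    .map(Number)
    .sort((a, b) => a - b)
    .filter((i) => i in illustrations)
    .map((i) => ({ sentence: sentences[i], illustration: illustrations[i] }));
}
```

Dropping incomplete steps keeps every page paired with a picture; a stricter version might instead retry the request when fewer than four pages survive.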
Next, we uploaded the values, story sentences, and illustrations to Convex’s cloud-hosted database. We chose Convex for its real-time updates, so that picture books would populate the frontend as soon as they were generated from user input. We implemented the above generation process in TypeScript as Convex actions and saved to our database with Convex mutations.
Finally, for the frontend, we designed our picture book in Figma and built it with ReactJS + Tailwind.
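To make the stored data concrete, here is a small sketch of how one picture book could be assembled before being written to the database. The record shape, field names, and the `assembleStorybook` helper are hypothetical; the actual write would happen inside a Convex mutation, which is not shown here.

```typescript
// Hypothetical shape of one stored picture book; field names are illustrative only.
interface StorybookRecord {
  value: string;
  pages: { sentence: string; imageUrl: string }[];
  createdAt: number;
}

// Pair each story sentence with its generated image and validate the result
// before it is handed to a database write (for example, a Convex mutation).
function assembleStorybook(
  value: string,
  sentences: string[],
  imageUrls: string[],
  now: number = Date.now()
): StorybookRecord {
  if (sentences.length !== imageUrls.length) {
    throw new Error(
      `Expected one image per sentence, got ${sentences.length} sentences and ${imageUrls.length} images`
    );
  }
  if (sentences.length < 4 || sentences.length > 6) {
    throw new Error("A storybook should have 4-6 pages");
  }
  return {
    value: value.trim().toLowerCase(),
    pages: sentences.map((sentence, i) => ({ sentence, imageUrl: imageUrls[i] })),
    createdAt: now,
  };
}
```

Validating the page count and the sentence-to-image pairing up front means the frontend never has to render a half-finished book.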
Note: Initially, we hoped to build our own model for generating a story and illustrations by fine-tuning with MonsterAPI. To do so, we first created a custom dataset, with values as inputs and stories as outputs, using GPT-4 and careful prompt engineering. With this dataset, we fine-tuned the Mistral 7B language model with MonsterAPI to generate our 4-6 step story sentences and illustration descriptions. Finally, we used Stable Diffusion XL through MonsterAPI, passing in the illustration descriptions to generate our picture book illustrations!
We successfully deployed the model and hit the endpoint to receive our picture book. However, our deployed model was terminated unexpectedly (potentially due to depleting our tokens), so we reverted to our original model for generation. In future steps, we hope to continue exploring an even more robust fine-tuned model!
Challenges we ran into
Our project involved several key phases and challenges, each requiring its own set of innovative solutions and collaborative efforts:
- Exploring New APIs: Our team delved into uncharted waters by integrating Convex DB and Monster API into our workflow. This introduced a steep learning curve for our team as we navigated the intricacies of these tools; however, we persevered until the end.
- Prompt Engineering Iterations: We began our journey by testing basic prompts with GPT-4; however, when we transitioned to open-source LLMs on Monster API, we encountered diverse outcomes. Many generated stories were unrelated to our intended content or lacked the desired structure. To address this, we iteratively refined our prompts, assessing the ideal balance of specificity and structure to elicit descriptive narratives effectively. We also fine-tuned and deployed the Mixtral-8x7B-Instruct-v0.1 LLM for this purpose.
- Stable Diffusion Challenges: Attempting to leverage Stable Diffusion XL for image generation, we faced several hurdles. For instance, inputting a prompt for a scene featuring both a turtle and a frog resulted in the model generating only one of the specified animals. To overcome such obstacles, we undertook a rigorous process of hyperparameter tuning and adapted our prompt structuring methods to enhance model comprehension.
- Custom Dataset Generation: Recognizing the importance of fine-tuning models for our specific requirements, we embarked on custom dataset generation. This process aimed to optimize model performance and accuracy for our application.
- Overcoming Deployment Hurdles: During the deployment phase, we encountered obstacles in fine-tuning and deploying models using Monster API. However, with invaluable assistance from the dedicated team at MonsterAPI—whose round-the-clock support on Discord proved instrumental—we successfully deployed a fine-tuned model capable of consistently generating desired outputs.
- Other Unexpected Results: Throughout our journey, we encountered unexpected results stemming from various factors such as NSFW issues and the necessity of including character names for accurate image descriptions. Additionally, we had to devise strategies for processing data with varying formats, ensuring consistency to facilitate smooth operation.
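The prompt-structuring and format-handling challenges above can be sketched as follows. This is only an illustration under assumptions: the `Character` type, the helper names, and the style suffix are hypothetical, and naming every character explicitly is a heuristic that may or may not improve results with a given image model.

```typescript
// Hypothetical character description used to name every subject in the prompt.
interface Character {
  name: string;
  species: string;
}

// Clean up descriptions that arrive in varying formats:
// leading numbering or bullets, wrapping quotes, and irregular whitespace.
function normalizeDescription(raw: string): string {
  return raw
    .trim()
    .replace(/^(\d+[.)]|[-*])\s*/, "") // leading "1.", "2)", "-", or "*"
    .replace(/^["']+|["']+$/g, "") // wrapping quotes
    .replace(/\s+/g, " "); // collapse runs of whitespace
}

// Build an image prompt that states each character by name and species up front,
// followed by the scene description and an assumed style suffix.
function buildIllustrationPrompt(description: string, characters: Character[]): string {
  const cast = characters.map((c) => `${c.name} the ${c.species}`).join(" and ");
  const subject = characters.length > 0 ? `${cast} together in one scene. ` : "";
  return `${subject}${normalizeDescription(description)} Children's picture book illustration, colorful, friendly.`;
}
```

For example, calling `buildIllustrationPrompt` with a turtle and a frog in the `characters` list puts both animals at the start of the prompt rather than relying on the description alone to mention them.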
Accomplishments that we're proud of
As students passionate about education, it was incredible to work towards solving an issue we all care about. Especially since some of us have had the opportunity to teach and interact with young kids, we kept them in mind as we built our product, which made the process so much more rewarding.
We are also proud that we built a generative picture book from scratch! We had little experience with these APIs and platforms or with fine-tuning LLMs, but through scrutinizing the documentation and running back and forth from Gates to Huang to seek help from mentors, we were able to pull all the pieces together. Especially when merging the pieces from the generation process to the database to the frontend, we were shocked and overjoyed that it worked.
What we learned
On the technical side, we learned how to work with completely new platforms and APIs such as Convex, MonsterAPI, and Hugging Face. We also struggled with creating a custom dataset from scratch and prompt engineering to receive our desired output formats, but we triumphed after much trial and error.
Finally, many of us experienced our first hackathon! From idea brainstorming to debugging and more debugging to building our final product, we learned how to work as a team and organize our priorities. We delegated tasks according to each teammate’s passion or experience level and planned out our time strategically to ensure we had an MVP up and running first.
What's next for Sentiment Storybook
The following are a few of the next steps that we hope to take in the near future.
- Adding more interactive features, such as offering several options for the storyline or presenting positive and negative examples for the kids to choose between.
- Fine-tuning SDXL with Monster API once it’s released (or HuggingFace) so that we can get better, more consistent images for our storybook.
- Continuing to fine-tune the LLM to generate longer, more comprehensive stories for older age groups. We could also add age group as another input variable.
- Extending beyond core values and principles to stories that teach more nuanced lessons, e.g. the importance of a healthy lifestyle.
Built With
- convex
- mistral
- monsterapi
- react