Inspiration
Many young kids find reading tedious, lonely, or hard to stay engaged with, especially at bedtime. We wanted to turn reading into a moment of joy, creativity, and comfort, even on nights when caretakers are too exhausted after a long day of work to read a bedtime story. We also wanted to give kids a creative way to engage with their own pictures by turning drawings into stories that they can listen to or read along with.
What it does
Draw My Story transforms an uploaded image of a child’s drawing into a fully generated, narrated story experience.
How we built it
Draw My Story is a full-stack, AI-based children’s story app that turns a drawing into a book that comes to life, with text-to-speech read-along and AI-generated visuals. The frontend uses React 18 for a responsive web interface styled to evoke a child’s imagination. The ElevenLabs API powers expressive, emotional text-to-speech, Gemini 2.5 Flash interprets and generates text, and Gemini Nano Banana generates images. At the heart of the application, a Node.js and Express backend manages secure JWT authentication. It is linked to a database powered by Snowflake, with DigitalOcean serving as the content delivery network, so users can log in to their accounts and return to previous stories through our “My Library” feature. Lastly, we used Git for version control and collaboration, Cursor for code editing, Claude for learning along the way, and Canva for planning.
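To illustrate the token-based authentication idea, here is a minimal sketch of how an HS256 JWT can be signed and verified using only Node's built-in `crypto` module. It is an illustration under stated assumptions rather than the app's actual code: a real Express backend would more likely rely on an established JWT library, the names `signToken`, `verifyToken`, and `SECRET` are hypothetical, and the sketch skips header validation.

```javascript
const crypto = require('node:crypto');

// Hypothetical secret for the demo; a real deployment would load it from
// an environment variable and never commit it to the repository.
const SECRET = 'demo-secret-change-me';

// Encode a string as base64url, the encoding JWT segments use.
function base64url(input) {
  return Buffer.from(input).toString('base64url');
}

// Compute the HMAC-SHA256 signature of the "header.payload" string.
function hmac(unsigned, secret) {
  return crypto.createHmac('sha256', secret).update(unsigned).digest('base64url');
}

// Build a signed token that expires ttlSeconds from now.
function signToken(payload, secret, ttlSeconds = 3600) {
  const header = { alg: 'HS256', typ: 'JWT' };
  const body = { ...payload, exp: Math.floor(Date.now() / 1000) + ttlSeconds };
  const unsigned = `${base64url(JSON.stringify(header))}.${base64url(JSON.stringify(body))}`;
  return `${unsigned}.${hmac(unsigned, secret)}`;
}

// Return the decoded payload if the signature is valid and the token has
// not expired; otherwise return null.
function verifyToken(token, secret) {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [headerPart, bodyPart, signature] = parts;

  const given = Buffer.from(signature);
  const expected = Buffer.from(hmac(`${headerPart}.${bodyPart}`, secret));
  // timingSafeEqual requires equal lengths, so check that first.
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return null;
  }

  const body = JSON.parse(Buffer.from(bodyPart, 'base64url').toString('utf8'));
  if (typeof body.exp !== 'number' || body.exp < Math.floor(Date.now() / 1000)) {
    return null;
  }
  return body;
}
```

In a setup like this, a login route would call `signToken({ userId })` after checking the password, and a middleware would call `verifyToken` on each request before returning a user's saved stories.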
Challenges we ran into
- Troubleshooting API keys - With many projects in Google Gemini, it was tough to find the right API key that would let us integrate higher-quality text generation for our stories.
- Database management - Using DigitalOcean and Snowflake posed technical challenges, such as persisting data per user account and ensuring reliable, efficient media storage.
- Collaborating with Git - It was the first time any of us had used Git on a project with multiple people actively coding at the same time. We faced many issues with merging branches and with making changes while other members were on a previous version.
- Optimizing our loading times - Originally, loading took over 2 minutes, but after hours of optimization we were able to cut it down to 45 seconds. We did this by making the API calls in parallel.
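The parallelization described in the last bullet can be sketched roughly as follows, assuming the image and narration requests for a page do not depend on each other. `fakeApiCall` is a hypothetical stand-in for the real image and text-to-speech requests, and the delays are made up for illustration.

```javascript
// Hypothetical stand-in for a real API request: resolves after `ms`
// milliseconds to simulate network latency.
function fakeApiCall(label, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(`${label} done`), ms));
}

// Sequential version: the audio request only starts once the image
// request has finished, so the two latencies add up.
async function generatePageSequential(pageText) {
  const image = await fakeApiCall(`image for "${pageText}"`, 50);
  const audio = await fakeApiCall(`audio for "${pageText}"`, 50);
  return { image, audio };
}

// Parallel version: both requests start immediately, and Promise.all
// waits for both, so the total time is roughly that of the slower one.
async function generatePageParallel(pageText) {
  const [image, audio] = await Promise.all([
    fakeApiCall(`image for "${pageText}"`, 50),
    fakeApiCall(`audio for "${pageText}"`, 50),
  ]);
  return { image, audio };
}

// Pages are treated as independent here, so they can run concurrently too.
// Promise.all preserves input order, so results line up with the pages.
function generateStory(pages) {
  return Promise.all(pages.map(generatePageParallel));
}
```

One caveat with this pattern is that `Promise.all` rejects as soon as any request fails, so a real pipeline may prefer `Promise.allSettled` or per-request retries, and may need to cap concurrency to stay within API rate limits.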
Accomplishments that we're proud of
- Learning how to use Git commands faster and more efficiently, and how to fix merge conflicts
- Creating a dreamy frontend that integrates well with our optimized backend, including its databases and credential handling
- Segmenting our project in a way that allowed all of us to contribute to the frontend and backend in different roles
- Most importantly, creating a project that can serve kids from all over the world and make their imagination come to life
What we learned
- How to work with a complicated stack that includes databases and deployment to web servers, and how to use cloud services like DigitalOcean and Snowflake for persistent storage
- Team-based programming: how to plan, communicate, and designate tasks in workflows accelerated by AI agents
- How to combine the backend and frontend, especially in a large codebase
What's next for Draw My Story
For future extensions of our project, we seek to incorporate multiple languages so that:
- Kids learning a second language can listen to and read along with the story narrator while having real-time access to a direct translation of the foreign text in their native tongue. Moving between the two languages would strengthen their comprehension and build on the language knowledge they already have. This would build on our existing multi-language layer.
- Older kids, such as teens practicing for language exams, can get customized content displayed in both their native language and their foreign language of study. This would give teens greater exposure to the variety of tonal inflections in the language they are studying. We hope that our platform can support kids and teens who may not have regular access to native speakers (e.g. kids learning English in communities without many English speakers, or teens studying Korean in communities with few Korean speakers besides their teachers during limited school hours).
Built With
- bcrypt
- css
- digitalocean
- elevenlabs
- express.js
- gemini
- javascript
- jest
- jwt
- node.js
- nodemon
- react
- router
- snowflake
- supertest