Inspiration

Every child loves hearing a story where they are the hero. But personalized, educational storybooks are often expensive, hard to find, or time-consuming to create.

Teachers and parents also struggle to explain complex topics—like how digestion works or why we feel angry—in a way that is engaging for a young child.

We asked ourselves: what if any parent could create a fully personalized, illustrated, narrated storybook in under 60 seconds?

That question became StoryPal.


What It Does

StoryPal is an AI-powered web app that generates personalized educational storybooks for children aged 3–12.

You enter your child's name, age, and appearance, then choose a topic. Within a minute, you get:

  • A multi-page story where your child is the main character
  • Unique watercolor-style illustrations for every page
  • Voice narration read aloud page by page
  • "Did you know?" fun fact boxes
  • An interactive quiz to reinforce learning
  • A downloadable PDF to print and keep

Topics span 35+ curated subjects including Science, Health, Emotions, Life Skills, and History—plus unlimited custom topics.


How We Built It

StoryPal uses a three-stage AI pipeline:

1. Story Generation: We use Groq (LLaMA 3.3 70B) to generate a structured JSON story (title, pages, facts, quiz) in ~3 seconds. Prompts dynamically inject the child’s details into the narrative.

2. Image Generation: Each page is converted into an image prompt and sent to Cloudflare Workers AI (SDXL Lightning), optimized for a consistent watercolor children's book style.

3. Voice Narration: We integrated Kokoro-82M (ONNX) using kokoro-js. Text is chunked into sentence-level segments, converted to audio, and merged into seamless page narration.
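As a rough sketch, the sentence-level chunking and merge in stage 3 could look like the following. The names chunkBySentence, splitLongSentence, and concatSamples, the 200-character default, and the choice to merge raw Float32Array samples before encoding a single WAV are all illustrative assumptions, not StoryPal's actual code or Kokoro's documented limit. The text-to-speech call itself is left out.

```typescript
// Hypothetical default; the real Kokoro input limit is not stated in this write-up.
const DEFAULT_MAX_CHARS = 200;

// Fallback for a single sentence longer than maxChars: split on whitespace.
// A single word longer than maxChars is kept whole.
function splitLongSentence(sentence: string, maxChars: number): string[] {
  const parts: string[] = [];
  let current = "";
  for (const word of sentence.split(/\s+/)) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length > maxChars && current) {
      parts.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) parts.push(current);
  return parts;
}

// Split text at sentence boundaries, then pack whole sentences into
// chunks of at most maxChars so no sentence is cut mid-way when avoidable.
function chunkBySentence(text: string, maxChars: number = DEFAULT_MAX_CHARS): string[] {
  // A sentence ends in ., !, or ? (plus any closing quote or bracket);
  // a trailing fragment without a terminator also counts.
  const sentences = text.match(/[^.!?]+[.!?]+["')\]]*|[^.!?]+$/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const raw of sentences) {
    const sentence = raw.trim();
    if (!sentence) continue;
    const pieces =
      sentence.length > maxChars ? splitLongSentence(sentence, maxChars) : [sentence];
    for (const piece of pieces) {
      const candidate = current ? `${current} ${piece}` : piece;
      if (candidate.length > maxChars && current) {
        chunks.push(current);
        current = piece;
      } else {
        current = candidate;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Join per-chunk audio back to back. Because no samples are inserted
// between parts, the merge itself adds no silence.
function concatSamples(parts: Float32Array[]): Float32Array {
  const total = parts.reduce((n, p) => n + p.length, 0);
  const out = new Float32Array(total);
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}
```

Each chunk would be sent to the speech model separately, and the resulting sample arrays passed to concatSamples in the original order before the page audio is encoded.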

  • Frontend: React + Vite + Tailwind CSS
  • UI/UX: Framer Motion + shadcn/ui
  • Storage: IndexedDB
  • PDF Export: jsPDF


Challenges We Faced

  • TTS Chunking: Kokoro has strict input limits. We implemented sentence-aware chunking and precise WAV concatenation without audible gaps.

  • Consistent Image Style: SDXL initially produced inconsistent outputs. We refined prompts heavily to maintain a uniform watercolor aesthetic.

  • JSON Reliability: LLM responses were sometimes malformed. We added schema validation and retry logic.

  • No Backend Storage: We used Base64 encoding + IndexedDB to persist full stories (text, image, audio) while avoiding quota issues.
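To illustrate the JSON reliability fix, here is a minimal sketch of schema validation with retry. The Story and QuizQuestion shapes are guessed from the fields named earlier (title, pages, facts, quiz). The field names inside a quiz question, the parseStory and generateStory helpers, and the default of three attempts are hypothetical. The model call is passed in as a function, so no particular Groq client API is assumed.

```typescript
// Assumed shapes; only title, pages, facts, and quiz are named in this write-up.
interface QuizQuestion {
  question: string;
  options: string[];
  answerIndex: number;
}

interface Story {
  title: string;
  pages: string[];
  facts: string[];
  quiz: QuizQuestion[];
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

function isQuizQuestion(v: unknown): v is QuizQuestion {
  if (typeof v !== "object" || v === null) return false;
  const { question, options, answerIndex } = v as Record<string, unknown>;
  return (
    typeof question === "string" &&
    isStringArray(options) &&
    typeof answerIndex === "number" &&
    Number.isInteger(answerIndex) &&
    answerIndex >= 0 &&
    answerIndex < options.length
  );
}

// Parse the raw model output and throw if it is not a well-formed Story.
function parseStory(raw: string): Story {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Response is not valid JSON");
  }
  if (typeof data !== "object" || data === null) {
    throw new Error("Story must be a JSON object");
  }
  const { title, pages, facts, quiz } = data as Record<string, unknown>;
  if (typeof title !== "string" || title.length === 0) throw new Error("Missing title");
  if (!isStringArray(pages) || pages.length === 0) throw new Error("Missing pages");
  if (!isStringArray(facts)) throw new Error("Invalid facts");
  if (!Array.isArray(quiz) || !quiz.every(isQuizQuestion)) throw new Error("Invalid quiz");
  return { title, pages, facts, quiz };
}

// Call the model up to maxAttempts times until the output validates.
// Errors thrown by callModel itself (for example, network failures) also count as a failed attempt.
async function generateStory(
  callModel: () => Promise<string>,
  maxAttempts: number = 3,
): Promise<Story> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return parseStory(await callModel());
    } catch (err) {
      lastError = err;
    }
  }
  throw new Error(`No valid story after ${maxAttempts} attempts: ${String(lastError)}`);
}
```

Passing the model call in as a callback keeps the validation logic testable without a network connection, and the attempt cap stops a persistently malformed response from looping forever.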


What We Learned

  • Groq’s inference speed enables real-time storytelling UX.
  • TTS architecture decisions depend on latency vs. model size tradeoffs.
  • Accessibility (WCAG, touch targets, keyboard nav) is essential in children’s apps from day one.

What's Next

  • Multi-language support (story + narration)
  • Shareable story links
  • Teacher dashboard for classrooms
  • Persistent character memory across stories
