Using Google Cloud AI
Pashabook transforms children's drawings into animated video storybooks using Google Cloud AI. A user uploads a drawing, and the system generates a 3-page story with illustrations (Gemini 2.0 Flash + Imagen 3), animations (Veo 3.1 Fast + FFmpeg), and narration (Cloud TTS). It is built with a React Native (Expo) frontend and a Node.js backend on Cloud Run, using Firestore for job tracking and Cloud Tasks for asynchronous processing. Videos are stored in Cloud Storage with a 24-hour TTL. The app supports Japanese and English, shows real-time progress, and mixes background music into the final video. Key learnings: managing Cloud Run timeouts for video processing, polling efficiently so that clients do not cause API storms, and running narration and animation in parallel, which reduced generation time from about 3 minutes to about 2 minutes.
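The polling learning above can be sketched as follows. This is a minimal illustration, not the project's actual client code: exponential backoff is one common way to keep status polling from hammering an API, and the write-up does not say which pattern Pashabook uses. The names `JobStatus`, `PollOptions`, `backoffDelay`, and `pollUntilFinished` are all hypothetical.

```typescript
// Hypothetical job states; the real Firestore job document fields are not given in the write-up.
type JobStatus = "pending" | "processing" | "done" | "error";

interface PollOptions {
  initialDelayMs: number; // wait after the first unsuccessful check
  maxDelayMs: number;     // upper bound so waits do not grow forever
  factor: number;         // multiplier applied after each attempt
  maxAttempts: number;    // give up after this many checks
}

// Delay after the given 0-based attempt: initial * factor^attempt, capped at maxDelayMs.
function backoffDelay(attempt: number, opts: PollOptions): number {
  const raw = opts.initialDelayMs * Math.pow(opts.factor, attempt);
  return Math.min(raw, opts.maxDelayMs);
}

// Polls fetchStatus until the job reaches a terminal state or the attempt budget runs out.
// fetchStatus is injected, so this works with any transport (REST call, Firestore read, ...).
async function pollUntilFinished(
  fetchStatus: () => Promise<JobStatus>,
  opts: PollOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<JobStatus> {
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (status === "done" || status === "error") {
      return status;
    }
    await sleep(backoffDelay(attempt, opts));
  }
  throw new Error("Job did not finish within the polling budget");
}
```

With `initialDelayMs: 1000`, `factor: 2`, and `maxDelayMs: 8000`, the waits grow 1 s, 2 s, 4 s, 8 s and then stay at 8 s, so a slow job costs a handful of requests rather than one per second.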
Inspiration
We wanted to bring children's imagination to life and strengthen parent-child bonds through digital storytelling.
What it does
Pashabook analyzes a child's hand-drawn illustration and automatically generates a narrated, animated video storybook in under 3 minutes.
How we built it
The frontend is React Native (Expo). The backend runs on Cloud Run with Firestore and uses Gemini 2.5 Flash Image, Cloud TTS, and Veo 3.1 Fast. Gemini's interleaved output generates the story text and the illustrations in a single API call.
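To make the single-call idea concrete, here is a rough sketch of how an interleaved response might be split into storybook pages. It assumes a simplified part shape (`text` or `inlineData` with base64 image data) and assumes that each page's text arrives before its image; neither assumption comes from the write-up. `Part`, `StoryPage`, and `groupIntoPages` are hypothetical names.

```typescript
// Simplified, assumed shape of one piece of an interleaved model response:
// either a chunk of text or an inline image (base64-encoded).
interface Part {
  text?: string;
  inlineData?: { mimeType: string; data: string };
}

interface StoryPage {
  text: string;
  imageBase64: string | null; // null when no image followed the text
}

// Walks the parts in order, accumulating text until an image appears,
// then closes the page. Assumes text precedes its illustration.
function groupIntoPages(parts: Part[]): StoryPage[] {
  const pages: StoryPage[] = [];
  let pendingText = "";

  for (const part of parts) {
    if (part.text !== undefined) {
      pendingText += part.text;
    } else if (part.inlineData !== undefined) {
      pages.push({ text: pendingText.trim(), imageBase64: part.inlineData.data });
      pendingText = "";
    }
  }

  // Trailing text without an image still becomes a page, so nothing is dropped.
  if (pendingText.trim() !== "") {
    pages.push({ text: pendingText.trim(), imageBase64: null });
  }
  return pages;
}
```

Keeping the grouping in a pure function like this makes it easy to validate that a response really contains the expected number of pages before starting animation and narration.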
Challenges we ran into
We hit Imagen 3 quota limits and migrated to the experimental Gemini 2.5 Flash Image model. Reaching sub-3-minute generation required parallel processing.
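A minimal sketch of the parallel step, assuming narration and animation for a page do not depend on each other. The two worker functions are injected and hypothetical; they stand in for whatever wraps Cloud TTS and Veo in the real backend. The timing helper uses made-up durations purely to show why overlapping the two stages shortens the wall-clock time.

```typescript
interface PageAssets {
  narrationUri: string;
  animationUri: string;
}

// Starts narration and animation for one page at the same time and waits for both.
// If either rejects, Promise.all rejects and the caller can mark the job as failed.
async function buildPageAssets(
  pageId: string,
  synthesizeNarration: (pageId: string) => Promise<string>, // hypothetical TTS wrapper
  renderAnimation: (pageId: string) => Promise<string>,     // hypothetical animation wrapper
): Promise<PageAssets> {
  const [narrationUri, animationUri] = await Promise.all([
    synthesizeNarration(pageId),
    renderAnimation(pageId),
  ]);
  return { narrationUri, animationUri };
}

// Rough wall-clock model: sequential stages add up, parallel stages cost as much as the slower one.
function estimateSeconds(narrationS: number, animationS: number, parallel: boolean): number {
  return parallel ? Math.max(narrationS, animationS) : narrationS + animationS;
}
```

For example, with illustrative durations of 20 s for narration and 60 s for animation, `estimateSeconds(20, 60, false)` gives 80 and `estimateSeconds(20, 60, true)` gives 60: the saving equals the shorter stage.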
Accomplishments that we're proud of
Implementing Gemini interleaved output as designed. Character-specific voices with BGM mixing. High-speed generation pipeline under 3 minutes.
What we learned
Gemini multimodal API capabilities. Importance of quota management. Prompt engineering for child-appropriate content generation.
What's next for Pashabook
Imagen 3 fallback implementation. Expanding to 5-6 pages. Enhanced multilingual support. Parent library features.
