💡 Inspiration
Creating engaging whiteboard animation videos traditionally requires multiple skills: scriptwriting, illustration, voice recording, and video editing. We were inspired to build an AI agent that could handle this entire workflow through simple conversation, making professional animation accessible to everyone.
🎯 What it does
Automation is an AI agent that transforms your ideas into complete whiteboard animation videos through a conversational interface. Users simply describe their concept, and the agent:
- Collects requirements through natural dialogue
- Generates project structure with clear objectives
- Creates story scripts with scenes and narration
- Produces visual assets using Midjourney for illustrations
- Generates AI voiceovers using Minimax audio synthesis
- Creates whiteboard animations from static images
- Composes final videos with synchronized audio and visuals
🛠️ How we built it
The project is a multi-stage pipeline built from the following components:
- Frontend: Next.js 15 with React Server Components for real-time UI updates
- AI Orchestration: Custom agent workflow using AI SDK for conversation management
- Asset Generation: Integrated Midjourney API for illustrations and Minimax for voice synthesis
- Animation Engine: Custom whiteboard animation system that converts static images to drawing animations
- Video Composition: @diffusionstudio/core for final video assembly
- Database: PostgreSQL with Drizzle ORM for task tracking and asset management
- Storage: AWS S3 for media file storage
- State Management: RxJS for complex async task orchestration
🚧 Challenges we faced
Complex Async Workflow: Managing multiple AI services (Midjourney, Minimax) with different response times and polling requirements. We solved this with a robust task tracking system using database-backed queues.
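To illustrate the polling side of this, here is a minimal sketch of how a database-backed task record could drive polling with capped exponential backoff. The `TrackedTask` fields, the `PollPolicy` values, and the function names are all hypothetical and chosen for illustration; they are not the project's actual schema or code.

```typescript
// Hypothetical shape of one row in a task-tracking table (assumed field names).
type TaskStatus = "pending" | "running" | "succeeded" | "failed";

interface TrackedTask {
  id: string;
  provider: "midjourney" | "minimax";
  status: TaskStatus;
  attempts: number;   // number of polls made so far
  nextPollAt: number; // epoch milliseconds of the next allowed poll
}

// Assumed tuning knobs; real values would depend on each provider.
interface PollPolicy {
  baseDelayMs: number;
  maxDelayMs: number;
  maxAttempts: number;
}

// Capped exponential backoff: base * 2^attempts, never above maxDelayMs.
function backoffDelay(attempts: number, policy: PollPolicy): number {
  return Math.min(policy.baseDelayMs * 2 ** attempts, policy.maxDelayMs);
}

// A task is due when it is still open and its next poll time has passed.
function isDue(task: TrackedTask, now: number): boolean {
  const open = task.status === "pending" || task.status === "running";
  return open && task.nextPollAt <= now;
}

// Fold the result of one poll into the stored record (returns a new record).
function recordPoll(
  task: TrackedTask,
  remote: "running" | "succeeded" | "failed",
  now: number,
  policy: PollPolicy,
): TrackedTask {
  const attempts = task.attempts + 1;
  if (remote !== "running") {
    // Terminal result from the provider: store it and stop polling.
    return { ...task, status: remote, attempts };
  }
  if (attempts >= policy.maxAttempts) {
    // Give up instead of polling forever.
    return { ...task, status: "failed", attempts };
  }
  return {
    ...task,
    status: "running",
    attempts,
    nextPollAt: now + backoffDelay(attempts, policy),
  };
}
```

Keeping the decision logic in pure functions like these means a worker can reload a task from the database after a restart and resume polling from the stored `attempts` and `nextPollAt` values.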
Asset Synchronization: Ensuring images, audio, and animations are properly synchronized. We implemented a dependency-aware task system where each stage waits for prerequisites.
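As a rough sketch of what "each stage waits for prerequisites" can look like, the snippet below selects the tasks whose dependencies are all complete and groups a project into execution waves. The `StageTask` shape and the dependency graph in the usage example are assumptions for illustration, not the project's real data model.

```typescript
// Hypothetical task record: each task lists the ids it must wait for.
interface StageTask {
  id: string;
  dependsOn: string[];
  done: boolean;
}

// Tasks that are not finished and whose prerequisites are all finished.
function readyTasks(tasks: StageTask[]): StageTask[] {
  const doneIds = new Set(tasks.filter((t) => t.done).map((t) => t.id));
  return tasks.filter(
    (t) => !t.done && t.dependsOn.every((dep) => doneIds.has(dep)),
  );
}

// Group tasks into waves; every task in a wave can run in parallel.
function planWaves(tasks: StageTask[]): string[][] {
  let state = tasks.map((t) => ({ ...t }));
  const waves: string[][] = [];
  while (state.some((t) => !t.done)) {
    const ready = readyTasks(state);
    if (ready.length === 0) {
      // Nothing can run but work remains: a cycle or a missing dependency.
      throw new Error("Unsatisfiable or cyclic dependencies");
    }
    const ids = ready.map((t) => t.id);
    waves.push(ids);
    state = state.map((t) => (ids.includes(t.id) ? { ...t, done: true } : t));
  }
  return waves;
}

// Usage with an assumed graph: script first, then illustration and
// voiceover in parallel, then animation, then final composition.
const exampleWaves = planWaves([
  { id: "script", dependsOn: [], done: false },
  { id: "illustration", dependsOn: ["script"], done: false },
  { id: "voiceover", dependsOn: ["script"], done: false },
  { id: "animation", dependsOn: ["illustration"], done: false },
  { id: "composition", dependsOn: ["animation", "voiceover"], done: false },
]);
```

In this example graph, the voiceover never blocks the animation, but the composition step cannot start until both are finished.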
Real-time Updates: Providing users with live progress updates across multiple generation stages. We used React Server Components with streaming responses and RxJS for reactive state management.
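One way to think about the reactive part is as a reducer that folds per-stage progress events into a single snapshot the UI can render. The sketch below is plain TypeScript with no dependencies; in an RxJS pipeline a reducer like this could be passed to the `scan` operator. The event shape and stage names are assumptions, not the project's actual types.

```typescript
// Assumed stage names, mirroring the generation steps described above.
type StageName =
  | "script"
  | "illustrations"
  | "voiceover"
  | "animation"
  | "composition";

const STAGES: StageName[] = [
  "script",
  "illustrations",
  "voiceover",
  "animation",
  "composition",
];

// Hypothetical progress event emitted whenever a stage advances.
interface ProgressEvent {
  stage: StageName;
  completed: number;
  total: number;
}

type ProgressState = Record<StageName, { completed: number; total: number }>;

// Starting snapshot: nothing known about any stage yet.
function emptyProgress(): ProgressState {
  return {
    script: { completed: 0, total: 0 },
    illustrations: { completed: 0, total: 0 },
    voiceover: { completed: 0, total: 0 },
    animation: { completed: 0, total: 0 },
    composition: { completed: 0, total: 0 },
  };
}

// Pure reducer: the latest event for a stage replaces that stage's entry.
function applyProgress(state: ProgressState, event: ProgressEvent): ProgressState {
  return {
    ...state,
    [event.stage]: { completed: event.completed, total: event.total },
  };
}

// Overall completion across all stages reported so far, as a whole percent.
function overallPercent(state: ProgressState): number {
  let completed = 0;
  let total = 0;
  for (const stage of STAGES) {
    completed += state[stage].completed;
    total += state[stage].total;
  }
  return total === 0 ? 0 : Math.round((completed / total) * 100);
}
```

Because the reducer is pure, the same function can serve both a streamed server response and a client-side subscription without the two drifting apart.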
Video Composition: Combining multiple media types (images, audio, animations) into cohesive videos. We built a custom composition engine that handles timing, transitions, and synchronization.
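The timing part of composition can be sketched independently of any video library. The example below lays scenes end to end, sizing each clip to the longer of its narration and its drawing animation so neither is cut off. The `SceneInput` fields, the `gapSec` default, and the function names are assumptions for illustration; this is not the `@diffusionstudio/core` API or the project's actual engine.

```typescript
// Hypothetical per-scene inputs: measured durations of the generated assets.
interface SceneInput {
  id: string;
  narrationSec: number; // length of the voiceover clip
  animationSec: number; // length of the whiteboard drawing animation
}

// Where a scene sits on the final timeline.
interface SceneClip {
  id: string;
  startSec: number;
  durationSec: number;
}

// Place scenes sequentially, with an assumed fixed gap between them.
function buildTimeline(scenes: SceneInput[], gapSec = 0.5): SceneClip[] {
  let cursor = 0;
  return scenes.map((scene) => {
    // The clip lasts as long as its longest asset.
    const durationSec = Math.max(scene.narrationSec, scene.animationSec);
    const clip: SceneClip = { id: scene.id, startSec: cursor, durationSec };
    cursor += durationSec + gapSec;
    return clip;
  });
}

// Total video length: end of the last clip (no trailing gap).
function totalDuration(clips: SceneClip[]): number {
  if (clips.length === 0) return 0;
  const last = clips[clips.length - 1];
  return last.startSec + last.durationSec;
}
```

With a plan like this, audio and visual tracks for each scene can share the same `startSec`, which keeps them aligned no matter which asset turned out longer.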
📚 What we learned
- AI Agent Design: How to structure conversational workflows that feel natural while maintaining task focus
- Async Task Orchestration: Managing complex multi-service workflows with proper error handling and retry logic
- Media Processing: Working with different media formats and ensuring cross-platform compatibility
- Real-time UX: Creating engaging user experiences for long-running AI processes
🚀 What's next for Automation
Canvas Virtual File System: Our most exciting upcoming feature is a visual canvas where AI generates and organizes all documents and assets spatially. Instead of traditional file hierarchies, creators will see their projects laid out on an infinite canvas with documents, images, audio files, and videos positioned contextually. We expect this visual workspace to make working with AI-generated content more intuitive and collaborative.
Enhanced Animation Styles: Expand beyond whiteboard animations to support multiple visual styles like motion graphics, 2D character animations, and infographic-style videos.
Multi-language Support: Add voice synthesis in multiple languages and automatic script translation to make content globally accessible.
Template Library: Build a collection of pre-designed templates for common use cases like product demos, educational content, and marketing videos.
Collaborative Features: Enable team collaboration with shared projects, review workflows, and brand consistency tools for enterprise users.
Advanced Customization: Allow users to fine-tune animation timing, visual styles, and voice characteristics for more personalized content.
Built With
- drizzle
- nextjs
- supabase
- typescript
- vercel

