Inspiration

We set out to bridge the gap between static manga panels and animated storytelling. While anime production is expensive and time-consuming, many creators and fans dream of seeing their work come to life. With recent advances in multimodal AI, we saw the opportunity to build a tool that could bring manga to motion instantly.

What it does

Animefy.io transforms manga images into short anime-style video clips. The platform automatically interprets the visual content of a manga panel, generates a text description of the scene, and then uses a text-to-video generative model to animate that description. The result is a seamless pipeline that brings drawings to life with minimal user input.

How we built it

Backend:
• Built with Flask and served through NGROK for testing
• Uses GLM-4V (Vision Language Model) to interpret manga panels into text descriptions
• Uses a text-to-video diffusion model (cerspense/zeroscope_v2_576w) to generate animated video frames
• Converts video frames into a downloadable MP4 file
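As a rough illustration of the text-to-video step, here is a minimal sketch of how cerspense/zeroscope_v2_576w can be driven through the diffusers library and written out as an MP4. It is not our exact backend code: the function names (generation_kwargs, load_pipeline, prompt_to_mp4), the output filename, and the default frame count, step count, and 576x320 resolution are assumptions chosen for the example. The handling of the returned frames also depends on the installed diffusers version, so treat it as a starting point.

```python
MODEL_ID = "cerspense/zeroscope_v2_576w"


def generation_kwargs(prompt: str, num_frames: int = 24, steps: int = 40) -> dict:
    """Build the arguments for one generation call.

    The defaults (24 frames, 40 steps, 576x320) are assumptions for this
    sketch; lower them to trade quality for VRAM and speed.
    """
    return {
        "prompt": prompt,
        "num_frames": num_frames,
        "num_inference_steps": steps,
        "height": 320,
        "width": 576,
    }


def load_pipeline():
    """Load the diffusion pipeline in half precision (needs a CUDA GPU)."""
    # Heavy imports are kept local so the helper above works without them.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    # Moves submodules to the GPU only while they run; requires `accelerate`.
    pipe.enable_model_cpu_offload()
    return pipe


def prompt_to_mp4(pipe, prompt: str, out_path: str = "clip.mp4") -> str:
    """Generate frames for `prompt` and write them to `out_path`."""
    from diffusers.utils import export_to_video

    frames = pipe(**generation_kwargs(prompt)).frames
    # Assumption: newer diffusers releases return a batch of videos, older
    # ones a flat list of frames. With more than one frame requested, a
    # length of 1 means "batch of one video", so unwrap it.
    if len(frames) == 1:
        frames = frames[0]
    return export_to_video(frames, output_video_path=out_path)
```

On a GPU machine, usage would look like `prompt_to_mp4(load_pipeline(), "a samurai draws his sword, anime style")`, which returns the path of the written file.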

Frontend:
• Built with React and styled using Tailwind CSS
• Allows users to upload manga panels, view results, and download videos
• Clean and responsive UI for both desktop and mobile

Pipeline:
1. Manga panel image is uploaded
2. GLM-4V generates a scene prompt
3. The prompt is passed into the diffusion model
4. Generated frames are exported as a video and returned to the user
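The four pipeline steps can be sketched as one function plus a thin Flask wrapper. This is an illustrative outline rather than our actual server code: the route name /animate, the form field name panel, the download name clip.mp4, and the three injected callables (describe_panel, generate_frames, export_video) are all hypothetical stand-ins for the GLM-4V call, the diffusion call, and the MP4 export.

```python
from typing import Any, Callable


def run_pipeline(
    image_bytes: bytes,
    describe_panel: Callable[[bytes], str],
    generate_frames: Callable[[str], Any],
    export_video: Callable[[Any], str],
) -> str:
    """Run steps 2-4 on an uploaded panel and return the video file path."""
    if not image_bytes:
        raise ValueError("empty upload")
    prompt = describe_panel(image_bytes)   # step 2: vision model -> scene prompt
    frames = generate_frames(prompt)       # step 3: prompt -> video frames
    return export_video(frames)            # step 4: frames -> MP4 path


def create_app(describe_panel, generate_frames, export_video):
    """Wrap run_pipeline in a Flask app (step 1: the upload)."""
    # Imported lazily so run_pipeline can be used and tested without Flask.
    from flask import Flask, jsonify, request, send_file

    app = Flask(__name__)

    @app.post("/animate")  # hypothetical route name
    def animate():
        upload = request.files.get("panel")  # hypothetical form field name
        if upload is None:
            return jsonify(error="missing 'panel' file"), 400
        try:
            video_path = run_pipeline(
                upload.read(), describe_panel, generate_frames, export_video
            )
        except ValueError as exc:
            return jsonify(error=str(exc)), 400
        return send_file(
            video_path,
            mimetype="video/mp4",
            as_attachment=True,
            download_name="clip.mp4",
        )

    return app
```

Passing the model calls in as plain functions keeps the web layer independent of the models, so the route can be exercised with stub functions before a GPU is available.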

Challenges we ran into

• Integrating two large models with distinct modalities (image-to-text and text-to-video)
• Managing VRAM and generation time in a Colab-based backend
• Ensuring high-quality video output with limited resolution and frame count
• Handling serialization and dependency issues with diffusers and PyTorch models
• Creating a smooth developer experience across Flask and React
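On the VRAM point, one common approach when two large models have to share a single Colab GPU is to load them one at a time and release each before loading the next. The sketch below shows that pattern under stated assumptions: run_stage and free_gpu_cache are hypothetical helper names, and the loader and usage functions are placeholders for whatever loads GLM-4V or the diffusion pipeline. It illustrates the general technique and is not a record of how our backend handled it.

```python
import gc
from typing import Any, Callable


def free_gpu_cache() -> bool:
    """Ask PyTorch to release cached GPU memory, if PyTorch and CUDA exist.

    Returns True when the CUDA cache was cleared, False otherwise.
    """
    try:
        import torch
    except ImportError:
        return False
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    return False


def run_stage(load_model: Callable[[], Any], use_model: Callable[[Any], Any]) -> Any:
    """Load a model, use it once, then drop it before returning the result.

    The result must not hold a reference to the model (return plain text,
    file paths, or arrays), otherwise the memory cannot be reclaimed.
    """
    model = load_model()
    try:
        return use_model(model)
    finally:
        del model          # drop the only reference held by this function
        gc.collect()       # collect any reference cycles left behind
        free_gpu_cache()   # hand cached blocks back to the GPU driver
```

With this shape, the two models would run as two consecutive run_stage calls, the first returning the scene prompt and the second returning the video path. The trade-off is extra load time on every request in exchange for a lower peak memory footprint.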

Accomplishments that we're proud of

• Built a fully functional end-to-end system that converts manga panels into animated video
• Integrated GLM-4V and ZeroScope into a single on-demand generation pipeline
• Created a usable and attractive frontend with full upload-to-video delivery
• Successfully generated compelling visual content from static input in under a minute

What we learned

• Deepened understanding of multimodal pipelines combining vision, language, and diffusion
• Learned to fine-tune prompts for optimal video generation results
• Gained practical experience in handling large models in constrained environments
• Improved full-stack deployment skills with Flask, React, and NGROK

What's next for Animefy.io

• Add audio generation, including voice and background music
• Support multi-panel sequences for richer storytelling
• Enable user-editable prompts and character customization
• Deploy on GPU-backed cloud infrastructure for public access
• Add login, saved scenes, and community sharing features
