Inspiration
As someone who likes to write, I’ve always wanted to bring my stories to life visually. I’ve seen incredible work on YouTube and tried to replicate it, but the process was grueling. I would spend hours prompting for individual scenes with Nano Banana Pro while struggling to maintain consistency, paying for expensive external voice tools like ElevenLabs, and then manually stitching everything together in CapCut. It was so tedious and time-consuming that I often gave up halfway through. I created Lorecast to collapse that entire workflow into a single, automated experience.
What it does
Lorecast is an AI-powered cinematic suite that transforms raw manuscripts into consistent, narrated animatics in minutes. It doesn't just generate random images; it builds a persistent "World Bible" to ensure your characters and environments stay visually accurate across every frame. Lorecast partitions your script, assigns emotional "Director's Briefs" for vocal performance and lighting, and uses a built-in video engine to export a finished MP4—moving you from "Idea" to "Film" without ever leaving the app.
How we built it
The application is a React system built on the Gemini 3 model family.
- Manuscript Reasoning: We used Gemini 3 Flash to extract complex visual entities, leaning on its 1M+ token context window to preserve story continuity.
- Nano Banana Integration: For high-fidelity image generation, we implemented localized Paint-to-Edit controls to allow for precise scene refinement.
- Cinematic Audio: We utilized the Gemini TTS capabilities with custom emotional performance tags to generate high-quality, narrated dialogue.
- Browser-Side Rendering: We integrated FFmpeg.wasm to handle the final video assembly directly in the user's browser, so footage never leaves their machine and no render server is needed.
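To illustrate the browser-side assembly step, here is a minimal sketch of how per-scene stills and durations could be turned into the manifest that FFmpeg's concat demuxer consumes. The scene shape (`image`, `audio`, `durationSec`) and the function name are assumptions for illustration, not Lorecast's actual schema.

```javascript
// Hypothetical sketch: build the concat-demuxer manifest FFmpeg.wasm reads
// when stitching per-scene stills and narration into one MP4.
// The scene object shape here is assumed, not Lorecast's real data model.
function buildConcatManifest(scenes) {
  return scenes
    .map((s) => `file '${s.image}'\nduration ${s.durationSec.toFixed(2)}`)
    .join("\n");
}

// The manifest text would be written into FFmpeg.wasm's virtual filesystem
// and referenced with `-f concat -i manifest.txt` (standard ffmpeg usage).
const manifest = buildConcatManifest([
  { image: "scene1.png", audio: "scene1.mp3", durationSec: 4.5 },
  { image: "scene2.png", audio: "scene2.mp3", durationSec: 6.0 },
]);
```

In a real pipeline the narration tracks would be muxed in with a second input and an audio concat filter; this sketch only covers the visual track.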
Challenges we ran into
- The Sandbox Security Wall: Modern browsers refuse to construct Web Workers from cross-origin CDN URLs, which initially broke our FFmpeg video export in sandboxed environments. I solved this with a "Blobify & Rewire" strategy: fetching the scripts as text, rewriting their internal imports with a regex, and serving them from same-origin Blob URLs.
- Character Drift: Preventing the AI from "forgetting" character details required deep prompt engineering and a custom reference-injection system that feeds "World Bible" assets into every new generation call.
- Quota Management: Managing the high multimodal demand of simultaneous image, audio, and text generation required building a robust exponential backoff system to handle rate limits.
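The "Blobify & Rewire" idea above can be sketched as follows. The function name, the regex, and the CDN URL are illustrative assumptions, not Lorecast's exact implementation; the key move is that a script fetched as text with rewritten imports can be loaded from a same-origin Blob URL, sidestepping the cross-origin Worker restriction.

```javascript
// Illustrative sketch (not Lorecast's actual code): rewrite a worker
// script's relative ES-module imports to absolute CDN URLs so the script
// can be served from a same-origin Blob URL instead of the CDN directly.
function rewireImports(scriptText, cdnBase) {
  // Match `from "./chunk.js"` / `from './chunk.js'` specifiers.
  return scriptText.replace(
    /from\s+(["'])\.\/([^"']+)\1/g,
    (_m, q, path) => `from ${q}${cdnBase}/${path}${q}`
  );
}

// In the browser, the rewritten text would then be wrapped like so:
//   const blob = new Blob([rewired], { type: "text/javascript" });
//   const worker = new Worker(URL.createObjectURL(blob), { type: "module" });
const rewired = rewireImports(
  `import { init } from "./ffmpeg-core.js";`,
  "https://cdn.example.com/ffmpeg"
);
```

A production version would need to handle dynamic `import()` calls and `importScripts` as well; the regex here covers only static `from` specifiers.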
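The backoff system described above can be sketched as a small async wrapper. Parameter names and defaults are illustrative, not Lorecast's actual code: each failed call waits twice as long as the last before retrying, which is the standard way to absorb rate-limit (HTTP 429) responses from a model API.

```javascript
// Minimal exponential-backoff wrapper (a sketch; names and defaults are
// assumptions, not Lorecast's real implementation).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withBackoff(fn, { retries = 5, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn(); // success: hand the result straight back
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the error
      await sleep(baseMs * 2 ** attempt); // 500ms, 1s, 2s, 4s, ...
    }
  }
}
```

In practice you would also add random jitter and honor any `Retry-After` header the API returns, so that parallel image, audio, and text calls do not all retry in lockstep.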
Accomplishments that we're proud of
We are incredibly proud of our Zero-Backend Video Pipeline. Being able to take a raw text file and generate a fully narrated, visually consistent MP4 file entirely within a browser sandbox is a massive technical milestone. We also succeeded in creating a "World Bible" system that actually works, maintaining specific character traits like Chidi’s respirator and the Dimension Lurker's glowing pulse across different camera angles.
What we learned
Building Lorecast taught us that "Consistency is King" in AI storytelling. We learned that the future of creative tools isn't just about making one "cool" image, but about maintaining a logical, visual thread across a long-running narrative. We also realized that browser-side tools like WebAssembly (Wasm) are the secret to building powerful, private creative suites that don't rely on expensive server infrastructure.
What's next for Lorecast
Lorecast is just the beginning of the automated cinema revolution. Our next steps include:
- Spatial Paint-to-Video: Allowing users to "draw" motion paths on storyboard frames to trigger Gemini's video generation capabilities.
- Collaborative World-Building: A shared "Writer's Room" where teams can contribute to a single, persistent World Bible.
- Dynamic Soundtrack Generation: Integrating AI music generation to match the emotional arc of the "Director's Brief" automatically.
