Inspiration
In the new age of AI, the tools for visual storytelling are exploding, but they often come with a catch: they require expensive cloud GPUs, compromise user privacy by uploading personal photos, and struggle to maintain character consistency. Our inspiration was to solve these problems.
We envisioned Storynia, a tool that would empower anyone to create rich, personalized visual stories with consistent characters, right from the security and privacy of their own computer. We wanted to make creative ownership and local-first AI a reality for everyone.
What it does
Storynia is a desktop app that turns you into an AI-assisted visual storyteller. You can generate characters for your stories or bring your own (e.g. you, your friends, or your pets).
Then you continue the story, imagining anything you like, and the story unfolds accordingly.
An LLM writes the story along with image prompts, and a diffusion model on your device generates and edits the images.
How we built it
Storynia is built on a powerful hybrid architecture that leverages the best of local and cloud technologies for a seamless user experience.
- Framework: We used Tauri (Rust backend + React frontend) to create a lightweight, native, cross-platform application.
- Local Image Generation: For image generation, we use Flux-Kontext-Dev via stable-diffusion.cpp, allowing efficient and fast inference on consumer-grade hardware via CUDA.
- Foreign Function Interface: The cornerstone of our project is a custom-built Rust FFI layer. We used the C ABI to create a library that lets our Tauri Rust backend communicate directly and safely with the C++ diffusion engine.
- LLM: We chose gpt-oss-120b (a MoE model with roughly 5B active parameters) via Groq for its incredible speed, providing near-instantaneous story and prompt generation to keep the user in a creative flow.
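The Rust-to-C++ bridge described above can be sketched as follows. This is a minimal illustration of the pattern, not Storynia's actual code: the function name `sd_generate` and its signature are hypothetical stand-ins for the real stable-diffusion.cpp bindings, and the "engine" here is mocked in Rust (it just echoes the prompt bytes) so the sketch compiles on its own. The real project would declare the function in an `extern "C"` block and link against the C++ library.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int, c_uchar};

// Stands in for the C++ diffusion engine. In the real project this
// would be an `extern "C"` *declaration* resolved at link time; the
// body here is a mock so the example is self-contained.
pub extern "C" fn sd_generate(
    prompt: *const c_char,
    out_buf: *mut c_uchar,
    out_len: c_int,
) -> c_int {
    // Safety contract: caller passes a valid NUL-terminated string and
    // a buffer of at least `out_len` bytes.
    let prompt = unsafe { CStr::from_ptr(prompt) };
    let bytes = prompt.to_bytes();
    let n = bytes.len().min(out_len as usize);
    unsafe { std::ptr::copy_nonoverlapping(bytes.as_ptr(), out_buf, n) };
    n as c_int
}

/// Safe wrapper: converts Rust types to C-ABI types, calls the engine,
/// and hands back Rust-owned memory.
pub fn generate_image(prompt: &str) -> Vec<u8> {
    let c_prompt = CString::new(prompt).expect("prompt must not contain NUL");
    let mut buf = vec![0u8; 1024];
    let written = sd_generate(c_prompt.as_ptr(), buf.as_mut_ptr(), buf.len() as c_int);
    buf.truncate(written as usize);
    buf
}

fn main() {
    let img = generate_image("a fox in a forest");
    println!("engine returned {} bytes", img.len());
}
```

The key design point is that all `unsafe` pointer handling stays behind one safe wrapper function, so the rest of the Tauri backend never touches raw pointers.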
Challenges we ran into
The biggest challenge was undoubtedly building the Rust-C++ FFI layer from scratch. Bridging two different memory models, handling data type conversions (like passing image data and text strings), and compiling it all into a stable library that could be bundled with Tauri was a significant undertaking.
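One concrete instance of the "bridging two memory models" problem is receiving an image from the C++ side. As a hedged sketch (the `RawImage` struct below is illustrative; stable-diffusion.cpp exposes a similar but not identical image struct), the Rust side mirrors the C layout with `#[repr(C)]` and immediately copies the engine-owned buffer into Rust-owned memory:

```rust
use std::os::raw::{c_uchar, c_uint};

// Hypothetical mirror of the C struct the engine returns; field names
// are illustrative, not the real stable-diffusion.cpp layout.
#[repr(C)]
pub struct RawImage {
    pub width: c_uint,
    pub height: c_uint,
    pub channels: c_uint,
    pub data: *mut c_uchar, // buffer owned and freed by the C++ side
}

/// Copy the engine-owned pixels into Rust-owned memory, so the C++
/// side can free its allocation without invalidating Rust data.
pub fn to_owned_pixels(img: &RawImage) -> Vec<u8> {
    let len = (img.width * img.height * img.channels) as usize;
    // Safety contract: the engine guarantees `data` points to `len`
    // valid bytes for the lifetime of `img`.
    unsafe { std::slice::from_raw_parts(img.data, len) }.to_vec()
}

fn main() {
    // Simulate a 2x1 RGB image allocated "on the C++ side".
    let mut pixels: Vec<u8> = vec![255, 0, 0, 0, 255, 0];
    let img = RawImage { width: 2, height: 1, channels: 3, data: pixels.as_mut_ptr() };
    println!("{:?}", to_owned_pixels(&img));
}
```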
We also spent considerable time fine-tuning the LLM prompts to ensure they generated descriptive, consistent visual details for stable-diffusion.cpp to work with. For example, we wanted the model to dynamically change poses and environments as the story progressed, so we closely followed the Kontext Prompting Guide from Black Forest Labs and integrated it into the LLM context as efficiently as possible.
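The technique of embedding the prompting guide into the LLM context can be sketched like this. Everything here is illustrative: the guide excerpt is a paraphrase (not the actual Black Forest Labs text), and `build_system_prompt` is a hypothetical helper, not Storynia's real prompt assembly.

```rust
// Placeholder paraphrase of Kontext-style editing guidance; the real
// project embeds material from the actual prompting guide.
const KONTEXT_GUIDE: &str = "When writing image-edit prompts: describe \
the change directly, keep the subject's identity words stable across \
edits, and state explicitly what must stay the same.";

/// Assemble a system prompt that pins character identity details and
/// the editing guide into every LLM call.
fn build_system_prompt(character_sheet: &str) -> String {
    format!(
        "You are a visual storyteller. For each story beat, output the \
         prose and one image prompt.\n\
         Character sheet (repeat identity details verbatim in every \
         image prompt):\n{character_sheet}\n\n\
         Prompting guide:\n{KONTEXT_GUIDE}"
    )
}

fn main() {
    let prompt = build_system_prompt("Milo: a small orange tabby cat with a blue scarf");
    println!("{prompt}");
}
```

Keeping the character sheet and guide in the system prompt, rather than re-sending them per turn, keeps token usage down while preserving visual consistency across generations.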
Accomplishments that we're proud of
We are incredibly proud of successfully creating a robust FFI layer, which is the technical heart of this project. This enabled us to achieve our primary goal: true local-first image generation.
Seeing a character-consistent story unfold, with images being created entirely on-device, felt really good.
We're also proud that Storynia runs effectively on consumer-grade hardware, making this powerful creative tool accessible to a much wider audience than just those with very high-end GPUs.
What we learned
This hackathon was a deep dive into the power of hybrid AI architectures. We learned that by intelligently delegating fast text generation to the cloud and sensitive, compute-heavy image generation to the local device, we can create applications that are both powerful and private.
We gained invaluable hands-on experience in low-level systems programming by building the FFI, and we learned just how crucial prompt engineering is to harnessing the power of generative models.
What's next for Storynia
The future for Storynia is bright. Our immediate plans are to enhance the creative experience by adding features like:
- Exporting stories to PDF
- Integrating Text-to-Speech (TTS) for audio narrations
- Adding multilingual support
Built With
- cpp
- javascript
- react
- rust
- tauri