NovelForge AI

A Gemini 3–powered multi-agent system that turns a single sentence into a consistent long-form story.

Input: one-line idea (+ optional constraints)
Output: Series Bible → Characters → Outline.jsonl → Slice plan → Multi-chapter drafts → Compressed memory (for consistency)


TL;DR

Long-form storytelling is not just “write better text” — it’s a structure + memory problem.
We solve it using:

  • Multi-agent specialization (like a real writing team)
  • Hierarchical generation (chapter → slice → prose)
  • Compressed memory injection (no embeddings / no vector DB)
  • Gemini 3 for planning, structured outputs, drafting, and memory compression

About the Project

What inspired us

Most LLM writing demos end at a short story, because once narratives become long, they drift:

  • characters “change personality”
  • canon facts contradict
  • pacing collapses across chapters

We wanted a simple experience:

Type one sentence. Get a structured, multi-chapter story that stays consistent.

What we built

A workflow-orchestrated, Gemini 3 multi-agent pipeline that generates:

  1. a Series Bible (world rules + plot beats)
  2. a Character System
  3. a chapter outline (outline.jsonl)
  4. a slice plan (scenes per chapter)
  5. prose drafted slice-by-slice
  6. and memory compressed after each chapter and injected into the next prompts
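End to end, the six stages above can be sketched as one sequential loop. This is a minimal illustration, not our actual code: the `call_agent` signature, the role names passed to it, and the one-line-per-chapter outline convention are all assumptions.

```python
# Illustrative sketch of the NovelForge pipeline; all names are hypothetical.
from dataclasses import dataclass

@dataclass
class Memory:
    short_term: str = ""   # current arc, recent events
    long_term: str = ""    # stable canon facts, accumulated per chapter

def run_pipeline(idea: str, call_agent) -> list[str]:
    """call_agent(role, prompt) -> str is a thin wrapper over the LLM."""
    bible = call_agent("architect", idea)                 # 1. Series Bible
    cast = call_agent("profiler", bible)                  # 2. Character System
    outline = call_agent("weaver", bible + cast)          # 3. one JSONL line per chapter
    memory = Memory()
    chapters = []
    for chapter_plan in outline.splitlines():
        slices = call_agent("slicer", chapter_plan)       # 4. slice plan
        prose = ""
        for s in slices.splitlines():                     # 5. slice-by-slice drafting
            prose += call_agent("writer", memory.long_term + memory.short_term + s)
        # 6. compress the finished chapter and carry it forward
        memory.short_term = call_agent("compressor", prose)
        memory.long_term += call_agent("canon_extractor", prose)
        chapters.append(prose)
    return chapters
```

The point of the sketch is the shape of the control flow: agents never call each other; the loop owns sequencing and memory.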

Why this matters (Market + Technical Gap)

Market opportunity

The books and publishing ecosystem is already massive. Even though headline figures describe the overall book market (not novels alone), long-form storytelling remains a high-value content segment, especially as serialized digital fiction and downstream IP adaptation expand.

The technical gap: long-form writing doesn’t fit into one model call

A standard novel commonly falls around 70k–100k words (genre dependent).
How Many Words in a Novel? (Reedsy)

Even with long-context models, “single-shot novel generation” is constrained by:

  • limited controllability,
  • output limits,
  • and the inability to preserve canon over many chapters without memory.

Public summaries for Gemini 3 Pro preview commonly cite up to ~64k output tokens and a very large input context window.
Gemini 3 Pro (Preview) - LLM Stats

So the real challenge becomes:

How do we scale storytelling beyond a single prompt while maintaining structure and long-term consistency?


Solution Overview

We scale beyond one context window by combining:

  1. specialized agents (planning, outlining, slicing, drafting)
  2. hierarchical decomposition (chapters → slices)
  3. compressed memory injection (summarize canon + recap after each chapter)

This makes long-form generation structured, stable, and explainable.


Architecture (High-Level)

Client → Orchestrator / Job Runner → Agents → Artifacts + Memory

  • The Orchestrator owns the workflow:
    • runs the chapter loop + slice loop
    • assembles prompts
    • injects compressed memory
    • validates outputs
    • persists artifacts
  • Agents stay role-focused (no messy cross-calls)
  • Memory is updated per chapter and injected into next steps
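Output validation in the orchestrator can be as small as a parse-and-retry wrapper around each agent call. A minimal sketch, assuming a simple fixed retry budget; the retry policy and the correction prompt wording are our illustration, not the exact implementation:

```python
import json

def validated_call(call_agent, role, prompt, retries=2):
    """Ask an agent for JSON; on a parse failure, retry with the error appended."""
    for _ in range(retries + 1):
        raw = call_agent(role, prompt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as err:
            # Feed the parser error back so the model can self-correct.
            prompt = f"{prompt}\n\nPrevious output was invalid JSON ({err}). Return valid JSON only."
    raise ValueError(f"{role} never produced valid JSON")
```

Because every artifact downstream of the Weaver is JSON/JSONL, this one chokepoint is where format reliability pays off.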

Agent Design (Role-Based Multi-Agent Pipeline)

Architect — Series Bible Builder

Turns the one-line idea into a Series Bible: world rules, plot beats, themes, stakes, and long-range structure.
Goal: make the story writeable for many chapters.

Profiler — Character System Designer

Builds a coherent cast: motivations, fears, goals, arcs, and relationship graph.
Goal: create characters that generate conflict naturally.

Weaver — Chapter Outline Generator

Creates the executable chapter plan: chapter summaries, turning points, pacing roles, and cliffhangers.
Goal: structure long-form narrative so drafting doesn’t drift.

Slicer — Chapter-to-Slice Decomposer

Breaks each chapter into 3–4 “slices” (scenes) with explicit objectives, conflicts, and transitions.
Goal: keep generation manageable and controllable.
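Concretely, a slice plan for one chapter might look like the following. The character, the field names, and the three-slice split are all illustrative, not our actual schema:

```python
# Hypothetical slice plan for one chapter; every field and name is illustrative.
slice_plan = [
    {"slice": 1,
     "objective": "Mira reaches the sealed archive",
     "conflict": "the archivist refuses her entry",
     "transition": "a blackout gives her an opening"},
    {"slice": 2,
     "objective": "Mira reads the forbidden ledger",
     "conflict": "what she finds implicates her mentor",
     "transition": "footsteps force her to hide"},
    {"slice": 3,
     "objective": "escape with the ledger",
     "conflict": "the mentor is waiting at the door",
     "transition": "chapter ends on the confrontation (cliffhanger)"},
]
```

Each slice hands the Writer a bounded task with an explicit exit condition, which is what keeps per-call generation controllable.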

Writer — Slice-Following Draft Writer

Drafts prose slice-by-slice, preserving tone and canon, and lands the chapter hook.
Goal: produce readable text while respecting the plan.


Memory System (Compressed Summaries, No Embeddings)

We intentionally do not use embeddings or a vector database. Instead, after each chapter, we compress the chapter into:

  • Short-term memory: current arc, recent events, immediate continuity constraints
  • Long-term memory (canon): stable facts (world rules, character truths, timeline constraints)

Then, for the next chapter/slice, the orchestrator:

  1. loads memory summaries
  2. injects them into the prompt context
  3. continues generation with continuity constraints

This yields long-form consistency without retrieval complexity.
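As a concrete sketch, the per-chapter compression step could look like this. The prompt wording, the RECAP/CANON labels, and the function names are hypothetical:

```python
# Hypothetical chapter-compression step; prompt text and names are illustrative.
COMPRESS_PROMPT = (
    "Summarize this chapter in two labeled parts.\n"
    "RECAP: recent events and open threads (short).\n"
    "CANON: new stable facts as bullet points.\n\n{chapter}"
)

def update_memory(memory: dict, chapter_text: str, call_llm) -> dict:
    """Compress a finished chapter and fold it into short/long-term memory."""
    summary = call_llm(COMPRESS_PROMPT.format(chapter=chapter_text))
    recap, _, canon = summary.partition("CANON:")
    memory["short_term"] = recap.replace("RECAP:", "").strip()  # overwritten each chapter
    memory["long_term"] = (memory["long_term"] + "\n" + canon.strip()).strip()  # accumulates
    return memory
```

Short-term memory is replaced every chapter while long-term canon only grows, which keeps the injected context small and stable.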


Why Gemini 3

Gemini 3 is not just “a good writer model” for us — it is well-suited for a workflow that repeatedly switches modes:

  • Strong planning and constraint-following. Agents like the Architect and Weaver require structured reasoning, not just fluent prose.

  • More reliable structured outputs. Our pipeline depends on JSON/JSONL artifacts, so format reliability directly affects system stability. Public summaries also highlight Gemini 3 Pro's large context window and high output capacity, which support long-form pipelines.
    Gemini 3 Pro (Preview) - LLM Stats

  • Strong summarization for memory compression. Compressing memory requires precise canon extraction plus a short recap, and Gemini 3 performs well at controlled summarization.

In short:

Gemini 3 makes multi-agent orchestration practical because it handles planning, structure, drafting, and compression reliably in one system.


What We Learned

  • Multi-agent systems are software architecture, not “more prompts”.
  • Memory is the real bottleneck for long-form LLM products.
  • Compression + injection is a strong baseline before adding retrieval/embeddings.

Built With

  • google-ai-studio