Call of Cthulhu: The Unseen Narrator
Speak your move. The story answers back.
A great Dungeon Master (DM) does far more than narrate. They listen closely, enforce rules fairly, pace tension, protect player agency, and keep the world coherent even when players do the unexpected. That craft takes years to learn and many hours to run well.
Call of Cthulhu: The Unseen Narrator was built to respect that craft, not replace it with generic text generation.
This project began with a simple but difficult question:
Can an AI system run a tabletop horror session with the structure, consistency, and responsiveness players expect from a real DM?
Our answer is a layered game-mastering system that combines narrative generation with strict control logic, state tracking, rules grounding, combat resolution, and voice interaction. This is not a single chatbot prompt. It is a full DM pipeline.
Why We Built This
Tabletop communities care deeply about immersion, fairness, and continuity. In horror RPGs especially, pacing and consequence matter. Most AI storytelling demos can generate prose, but they often break the game loop: impossible actions pass, outcomes are declared without risk, and narrative jumps ignore scene progression.
We wanted to build something that behaves like a Keeper, Call of Cthulhu's term for the game master:
- Validate what players attempt
- Decide when dice are actually needed
- Resolve uncertainty with consequences
- Move the story through a structured state graph
- Preserve momentum without railroading players
- Deliver the experience through voice, narration, music, and scene handouts
We also interviewed experienced Dungeon Masters and incorporated their feedback on pacing, tension escalation, player agency, and failure-forward storytelling. The goal was to stay close to the tabletop community while proving that imaginative, entertaining AI play can still be rule-aware and coherent.
What We Built
The system orchestrates an end-to-end game loop:
- Player actions are collected via text or live voice input.
- A safety and feasibility layer validates actions.
- A roll-decision layer determines whether a roll is required.
- Dice outcomes are resolved and attached to player context.
- The DM engine generates per-player narrative outcomes in structured JSON.
- Output is validated against the state-action graph.
- Combat can branch into a dedicated combat resolver.
- Narration, ambient audio, and visual handouts are delivered to players.
- Session state is persisted and exposed through backend APIs.
This architecture is deliberately modular. Each layer has one responsibility and can be tested independently.
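As a rough illustration of this layering, the loop can be pictured as a chain of single-purpose functions that each take and return a turn record. Every name below (`Turn`, `validate_action`, `decide_roll`, `narrate`, `run_turn`) is hypothetical, and the layer bodies are placeholders rather than the project's actual logic.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Turn:
    """Hypothetical record passed from layer to layer for one player action."""
    player: str
    action: str
    rejected: Optional[str] = None   # reason, if a layer refused the action
    needs_roll: bool = False
    narrative: str = ""
    trace: List[str] = field(default_factory=list)  # layers that ran, in order


Layer = Callable[[Turn], Turn]


def validate_action(turn: Turn) -> Turn:
    # Placeholder feasibility check: only rejects empty input.
    turn.trace.append("validate")
    if not turn.action.strip():
        turn.rejected = "empty action"
    return turn


def decide_roll(turn: Turn) -> Turn:
    # Placeholder roll decision: defaults to no roll.
    turn.trace.append("decide_roll")
    turn.needs_roll = False
    return turn


def narrate(turn: Turn) -> Turn:
    # Placeholder for the DM engine's structured narrative output.
    turn.trace.append("narrate")
    turn.narrative = f"{turn.player} tries to {turn.action}."
    return turn


def run_turn(turn: Turn, layers: List[Layer]) -> Turn:
    """Run layers in order, stopping as soon as one rejects the action."""
    for layer in layers:
        turn = layer(turn)
        if turn.rejected is not None:
            break
    return turn


PIPELINE: List[Layer] = [validate_action, decide_roll, narrate]
```

Because each layer shares one signature, a layer can be exercised alone in a unit test or swapped out without touching the others.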
Core Architecture
1) Meta Controller (Game Logic Guardrails)
The meta-controller is responsible for game integrity before narrative generation:
- Rejects physically impossible human actions.
- Blocks game-hijacking attempts or declared outcomes.
- Rewrites guaranteed-success phrasing into attempt-based actions.
- Uses context from scene and DM suggestions to judge feasibility.
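The guardrails above can be sketched with simple pattern rules. This page does not say how the checks are implemented, so the regular expressions below are only a stand-in, and `Verdict` and `review_action` are hypothetical names. A production version would also need the scene context that the last bullet mentions.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; a real Keeper would judge these in context.
IMPOSSIBLE = re.compile(r"\b(fly|teleport|walk through walls)\b", re.IGNORECASE)
HIJACK = re.compile(
    r"\bignore (?:all |the )?(?:previous |prior )?instructions\b", re.IGNORECASE
)
GUARANTEED = re.compile(r"\bI (?:successfully|easily|instantly) ", re.IGNORECASE)


@dataclass
class Verdict:
    status: str        # "accept", "rewrite", or "reject"
    action: str        # action text to pass on (rewritten if needed)
    reason: str = ""


def review_action(text: str) -> Verdict:
    """Classify one player action before any narrative is generated."""
    if HIJACK.search(text):
        return Verdict("reject", text, "attempt to hijack the game")
    if IMPOSSIBLE.search(text):
        return Verdict("reject", text, "physically impossible for a human")
    if GUARANTEED.search(text):
        # Turn a declared outcome into an attempt the dice can still decide.
        rewritten = GUARANTEED.sub("I attempt to ", text)
        return Verdict("rewrite", rewritten, "outcome is not the player's to declare")
    return Verdict("accept", text)
```

For example, `review_action("I easily pick the lock")` yields a rewrite to "I attempt to pick the lock", which leaves the result open for the roll layer.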
2) Roll Intelligence (Uncertainty with Consequence)
Not every action should require a roll. The roll module applies a conservative rule:
- Default to no roll.
- Roll only when the outcome is uncertain and meaningful.
- Choose a skill or characteristic grounded in the investigator's actual data.
- Deny actions requiring skills a character does not possess.
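A minimal sketch of that rule follows. The function names and the flat `sheet` dictionary are assumptions, and the check assumes the usual percentile convention in which a d100 roll at or under the skill value succeeds. How the project actually judges "uncertain" and "meaningful" is not described here, so both arrive as plain booleans.

```python
import random
from typing import Dict, Optional, Tuple


def decide_roll(uncertain: bool, meaningful: bool,
                skill: Optional[str], sheet: Dict[str, int]) -> str:
    """Return "deny", "roll", or "no_roll" for one action."""
    if skill is not None and skill not in sheet:
        return "deny"          # the investigator lacks the required skill
    if skill is not None and uncertain and meaningful:
        return "roll"
    return "no_roll"           # conservative default


def percentile_check(skill_value: int, rng: random.Random) -> Tuple[int, bool]:
    """Roll d100; assumed rule: success when the roll is at or under the skill."""
    roll = rng.randint(1, 100)
    return roll, roll <= skill_value
```

Passing the random generator in as a parameter keeps the check reproducible in tests.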
3) State-Action Graph (Narrative Coherence Engine)
The story world is represented as graph states and transitions:
- Players progress through explicit world states.
- Suggested state changes are validated for reachability.
- Transition jumps are constrained, including two-hop validation, so the story cannot skip ahead arbitrarily.
- Terminal states are enforced so endings behave like real endings.
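One way to picture these checks is a small adjacency map with a bounded breadth-first search. The scene names are invented for illustration, and reading "two-hop validation" as "the suggested state must be reachable within two transitions" is an assumption, not a confirmed detail of the project.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical scenario graph: state -> states reachable in one transition.
GRAPH: Dict[str, List[str]] = {
    "arrival": ["library", "cellar_door"],
    "library": ["cellar_door"],
    "cellar_door": ["cellar"],
    "cellar": ["escape", "madness"],
    "escape": [],
    "madness": [],
}
TERMINAL: Set[str] = {"escape", "madness"}


def within_hops(graph: Dict[str, List[str]], start: str,
                target: str, max_hops: int = 2) -> bool:
    """Breadth-first search that gives up after max_hops transitions."""
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        node, depth = frontier.popleft()
        if node == target and depth > 0:
            return True
        if depth == max_hops:
            continue
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return False


def validate_transition(current: str, suggested: str) -> bool:
    """Accept staying put or a short reachable jump; endings stay endings."""
    if current in TERMINAL:
        return False
    if suggested == current:
        return True
    return within_hops(GRAPH, current, suggested)
```

With this graph, a jump from `arrival` to `cellar` passes because it is two transitions away, while a jump from `arrival` straight to `escape` is refused.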
4) Dungeon Master Engine (Structured Output)
The DM module receives current state context, character sheets, recent rolls, and rulebook context. It returns strict JSON per player, including:
- Narrative outcome
- Suggested actions and next nodes
- Clues, NPCs, and environmental details
- HP/sanity deltas
- Music and handout recommendations
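Strict JSON is only useful if it is checked before the game state changes. The sketch below shows one possible check. The field names and types in `EXPECTED` are hypothetical, since the project's real schema is not shown on this page.

```python
import json
from typing import Any, Dict, List

# Hypothetical field names and types; the project's real schema is not shown.
EXPECTED = {
    "narrative": str,
    "suggested_actions": list,
    "next_nodes": list,
    "hp_delta": int,
    "sanity_delta": int,
}


def check_outcome(outcome: Any) -> List[str]:
    """Return a list of problems; an empty list means the outcome looks valid."""
    if not isinstance(outcome, dict):
        return ["outcome is not an object"]
    problems = []
    for name, expected_type in EXPECTED.items():
        if name not in outcome:
            problems.append(f"missing field: {name}")
        elif isinstance(outcome[name], bool) or not isinstance(outcome[name], expected_type):
            # bool is excluded explicitly because it is a subclass of int.
            problems.append(f"wrong type for {name}")
    return problems


def parse_dm_output(raw: str) -> Dict[str, List[str]]:
    """Parse the engine's JSON and report problems per player."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object keyed by player")
    return {player: check_outcome(outcome) for player, outcome in data.items()}
```

A non-empty problem list for any player could then trigger regeneration instead of letting a malformed outcome reach the table.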
5) Combat Mode (Specialized Resolver)
Combat is handled by a dedicated engine, not ad hoc prose:
- Tracks round-by-round enemy state and HP.
- Validates combat actions.
- Runs roll checks and deterministic damage rails.
- Cleanly hands combat summary back to narrative mode.
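As a hedged sketch of a single combat step: the resolver validates the target, compares a roll against the attacker's skill, and applies bounded damage to tracked HP. `Enemy` and `resolve_attack` are hypothetical names, and "deterministic damage rails" is read here as clamping damage to a fixed range, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Enemy:
    name: str
    hp: int


def resolve_attack(enemy: Enemy, roll: int, skill_value: int,
                   damage: int, max_damage: int = 6) -> str:
    """Apply one attack and return "invalid", "miss", "hit", or "defeated"."""
    if enemy.hp <= 0:
        return "invalid"                      # cannot attack a downed enemy
    if roll > skill_value:
        return "miss"
    dealt = max(0, min(damage, max_damage))   # keep damage inside fixed bounds
    enemy.hp = max(0, enemy.hp - dealt)
    return "defeated" if enemy.hp == 0 else "hit"
```

Keeping HP changes in plain code like this means the narrative layer can describe a blow but never decide how much it hurt.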
6) Voice, Narration, and Session Delivery
- Streaming speech input with end-of-turn detection.
- TTS narration for generated outcomes.
- Ambient music serving and visual handout support.
- Flask API endpoints for setup, validation, and session state.
Immersion Through Clue Handouts and Ambient Music
One of the most important lessons from tabletop play is that imagination gets stronger when players are given sensory anchors. In our system, narrative is not delivered as text alone. The DM can select scene-appropriate clue handouts and ambient music as part of each game-state update, so players feel the world rather than just reading about it.
- Clue handouts (letters, journals, maps, creature visuals) are revealed when exploration reaches the right moment, giving players tangible evidence to inspect, discuss, and theorize about.
- Ambient music selection is tied to scene transitions and mood changes, helping sustain dread, uncertainty, relief, or escalation without distracting from player agency.
- The system avoids random media switching; it keeps continuity and changes tracks only when the narrative beat truly shifts.
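The continuity rule in the last bullet can be sketched as a tiny state holder that switches tracks only when the mood label changes. `MusicDirector`, the mood labels, and the file names are all placeholders, not the project's actual assets or API.

```python
from typing import Dict, Optional

# Hypothetical mood-to-track table; the file names are placeholders.
TRACKS: Dict[str, str] = {
    "dread": "low_drone.ogg",
    "relief": "quiet_strings.ogg",
    "escalation": "rising_pulse.ogg",
}


class MusicDirector:
    def __init__(self) -> None:
        self.mood: Optional[str] = None
        self.track: Optional[str] = None

    def update(self, mood: str) -> bool:
        """Switch tracks only when the mood changes; return True on a switch."""
        if mood == self.mood or mood not in TRACKS:
            return False          # keep the current track playing
        self.mood = mood
        self.track = TRACKS[mood]
        return True
```

Calling `update("dread")` on consecutive turns switches once and then leaves the track alone until the mood actually changes.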
This matters because Call of Cthulhu is an imagination-first game. The fear is not only in what is shown, but in what players infer, suspect, and construct together at the table. Handouts and sound design create shared mental imagery, and that shared imagery is what turns a session from “AI output” into a living roleplaying experience.
Why Imagination Is Central to This Project
Tabletop horror works when players co-create meaning: one clue suggests a hidden history, one sound implies danger behind a door, one description makes everyone picture something different but equally unsettling. Our project is built to support exactly that loop. The AI does not aim to replace imagination; it aims to activate it by combining structured storytelling, player-driven choices, and immersive narrative artifacts that keep the world vivid and believable.
Engineering Effort and Quality
This project represents significant engineering beyond prompt design:
- Full pipeline orchestration across multiple subsystems.
- Validation and regeneration constraints for narrative control.
- Dedicated combat runtime with roll-grounded outcomes.
- Backend API design for product integration.
- Extensive integration tests around transitions and full-flow execution.
What Makes This Different
Most AI game demos are text generators. The Unseen Narrator is a game-mastering system.
| Feature | The Unseen Narrator | Standard AI Chat |
|---|---|---|
| Rules | Enforces mechanics | Ignores or hallucinates rules |
| State | Persistent tracking | Short-term memory |
| Outcomes | Fail-forward, dice-based | Arbitrary success |
| Media | Integrated voice and music | Text only |
In short: this is not “chat with a fantasy bot.” It is an attempt to model the real workload of a Dungeon Master in software, while honoring the tabletop experience.
Vision
Our long-term vision is to make high-quality, story-rich tabletop experiences more accessible while staying faithful to the roleplaying community. We believe AI should amplify imagination, not flatten it.
Call of Cthulhu: The Unseen Narrator is our step toward that vision: a structured, community-informed, voice-ready AI Keeper built for immersion, consequence, and entertainment.