Project Report: Chain of Consciousness: Collaborative Storytelling and Image Generation
1. Inspiration
We wanted to create a collective storytelling experience that captures how a single line or choice can ripple throughout an entire narrative. The idea was to blend collaborative input, LLM-based text generation, and AI-powered artwork to visualize how stories can evolve in unexpected directions. Our main inspirations were:
- “Exquisite Corpse”–style creative writing games, where multiple participants build a story line by line.
- The chain-of-events concept, illustrating how each small narrative step can influence the entire story arc.
- The emerging possibilities of generative AI (both text and image) and how they can enhance creativity in a community setting.
2. What We Learned
Prompt Engineering
- We learned that clear, focused prompts are essential for getting Large Language Models (LLMs) to return valid JSON or other structured output.
- Handling edge cases where the LLM might produce code fences or unexpected text required thoughtful post-processing.
Image Generation & API Handling
- We experimented with DALL·E 3 and learned to craft effective image prompts by merging story metadata (genre, tone, style) with focal characters and settings.
- Explored how to save images locally from an AI image API response.
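As an illustration of saving an image locally, here is a minimal sketch that assumes the image API was asked to return the picture as a base64 string (OpenAI's Images API offers this through `response_format="b64_json"`). The function name `save_image_from_b64` and the output path are hypothetical; the project itself downloads the image by URL with `requests`.

```python
import base64
from pathlib import Path


def save_image_from_b64(b64_payload: str, out_path: str) -> Path:
    """Decode a base64-encoded image payload and write the bytes to disk.

    Hypothetical helper: the payload is assumed to be the base64 string
    taken from the image API response.
    """
    image_bytes = base64.b64decode(b64_payload)
    path = Path(out_path)
    # Create the target folder if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(image_bytes)
    return path
```

Keeping the decode-and-write step separate from the API call makes it easy to test without network access.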
Designing an Evaluation & Ranking System
- We implemented a method that scores stories on dimensions such as plot cohesion, creativity, and characters, then maps the final numeric total to a rank (E → SS).
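The scoring idea can be sketched as follows. Everything specific here is an assumption for illustration: the three dimensions, the 0 to 10 scale per dimension, the intermediate rank ladder between E and SS, and the cut-off values are not taken from the project.

```python
from bisect import bisect_right

# Hypothetical dimensions, each assumed to be scored 0-10.
DIMENSIONS = ("plot_cohesion", "creativity", "characters")

# Hypothetical rank ladder and cut-offs: a total below 10 is E,
# 10-13 is D, 14-17 is C, 18-21 is B, 22-24 is A, 25-27 is S, 28+ is SS.
RANKS = ("E", "D", "C", "B", "A", "S", "SS")
RANK_FLOORS = (10, 14, 18, 22, 25, 28)


def rank_story(scores: dict) -> tuple:
    """Sum the per-dimension scores and map the total to a rank."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    total = sum(scores[d] for d in DIMENSIONS)
    # bisect_right counts how many floors the total has reached.
    return total, RANKS[bisect_right(RANK_FLOORS, total)]
```

A table of floors keeps the thresholds in one place, so retuning the ranks does not require touching the mapping logic.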
3. How We Built the Project
Architecture Overview
- Discord/CLI: Users provide story lines or prompt the AI to generate them.
- LLM Integration (OpenAI GPT): For generating next lines, extracting metadata (new characters, settings), and summarizing stories.
- In-Memory Data Storage: A global dictionary that stores each story by storyId. Each story object includes metadata, lines, characters, settings, etc.
- Finalization & DB Hook: Once a story finishes, we “finalize” it, applying a final AI-driven metadata pass, then remove it from memory (with an option to save it in a database).
Workflow
- create_story: Initializes the JSON structure and requests initial metadata from the LLM.
- add_new_line_and_update: Appends new lines (user or AI), triggers character/setting extraction, updates summaries.
- finalize_story: Uses the LLM to propose final metadata (title, genre, etc.), then moves the story out of memory.
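The three workflow steps can be sketched against an in-memory dictionary keyed by storyId. This is a simplified illustration, not the project's code: the signatures are assumptions, the LLM calls are replaced by plain arguments (`initial_metadata`, `final_metadata`) and a caller-supplied `extract` function, and the summary is a placeholder string join rather than an LLM summary.

```python
import uuid
from typing import Callable, Optional

# Global in-memory store: storyId -> story object.
STORIES = {}


def create_story(initial_metadata: dict) -> str:
    """Initialize the story structure; the metadata stands in for the LLM's reply."""
    story_id = uuid.uuid4().hex
    STORIES[story_id] = {
        "storyId": story_id,
        "metadata": dict(initial_metadata),
        "lines": [],
        "characters": [],
        "settings": [],
        "summary": "",
    }
    return story_id


def add_new_line_and_update(story_id: str, text: str,
                            extract: Callable[[str], dict]) -> dict:
    """Append a line, merge newly extracted characters/settings, refresh the summary."""
    story = STORIES[story_id]
    story["lines"].append(text)
    found = extract(text)  # stand-in for the LLM extraction call
    for key in ("characters", "settings"):
        for item in found.get(key, []):
            if item not in story[key]:
                story[key].append(item)
    # Placeholder: the real project asks the LLM for an updated summary.
    story["summary"] = " ".join(story["lines"])
    return story


def finalize_story(story_id: str, final_metadata: dict,
                   save: Optional[Callable[[dict], None]] = None) -> dict:
    """Apply final metadata, remove the story from memory, optionally persist it."""
    story = STORIES.pop(story_id)
    story["metadata"].update(final_metadata)
    if save is not None:
        save(story)  # hypothetical database hook
    return story
```

Passing the extractor and the save hook as callables keeps the sketch runnable without an API key or a database.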
Image Generation
- A build_dalle_prompt function dynamically constructs a DALL·E 3 prompt from the story’s metadata, characters, and setting.
- We then download and save the resulting image to disk using Python’s requests.
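A prompt builder along these lines might look like the sketch below. Only the function name comes from the project; the signature, the story layout (a `metadata` dict plus `characters` and `settings` lists of strings), the three-character cap, and the wording of the prompt are assumptions.

```python
def build_dalle_prompt(story: dict) -> str:
    """Assemble an image prompt from story metadata, characters, and setting."""
    meta = story.get("metadata", {})
    parts = []

    # Style cues first: visual style, then genre and tone, skipping missing ones.
    style_bits = [meta.get(key) for key in ("style", "genre", "tone")]
    style = ", ".join(bit for bit in style_bits if bit)
    if style:
        parts.append(f"An illustration in this style: {style}.")

    # Limit to a few focal characters so the image stays readable.
    characters = story.get("characters", [])
    if characters:
        parts.append("Focal characters: " + ", ".join(characters[:3]) + ".")

    # Use the most recent setting as the scene.
    settings = story.get("settings", [])
    if settings:
        parts.append(f"Setting: {settings[-1]}.")

    if not parts:
        return "An illustration for a collaboratively written story."
    return " ".join(parts)
```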
4. Challenges Faced
Ensuring Valid JSON from the LLM
- The model sometimes returned code blocks or extra commentary. We overcame this by:
- Prompting: “Respond only in valid JSON.”
- Regex post-processing to strip code fences if needed.
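The post-processing step can be sketched like this. The helper name `parse_llm_json` and the exact regex are assumptions; the second fallback (cutting out the outermost braces when the model adds commentary without a fence) goes slightly beyond the fence stripping described above.

```python
import json
import re

# Matches a fenced block with an optional language tag and captures its body.
# The backticks are written as `{3} so this example can itself sit in a fence.
_FENCE_RE = re.compile(r"`{3}[\w-]*[ \t]*\n(.*?)`{3}", re.DOTALL)


def parse_llm_json(raw: str):
    """Parse JSON from an LLM reply that may include code fences or commentary."""
    text = raw.strip()
    match = _FENCE_RE.search(text)
    if match:
        # Keep only what was inside the first fenced block.
        text = match.group(1).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback: try the outermost {...} span, in case of extra commentary.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            raise
        return json.loads(text[start:end + 1])
```

If nothing parseable is found, the `json.JSONDecodeError` propagates, so the caller can decide whether to retry the request.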
Maintaining Story Consistency
- Sometimes the AI would introduce contradictory details (like reusing character names differently).
- We mitigated this by storing structured data (e.g., characters array) and letting the LLM refer back to the existing “story so far.”
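One way to let the LLM refer back to the stored data is to build each continuation prompt from the characters array and the most recent lines. The sketch below is an assumption about how that could look: the function name `build_continuation_prompt`, the prompt wording, and the ten-line window are hypothetical, and story lines and characters are assumed to be plain strings.

```python
def build_continuation_prompt(story: dict, max_recent_lines: int = 10) -> str:
    """Build a next-line prompt that restates known characters and the story so far."""
    characters = story.get("characters", [])
    known = ", ".join(characters) if characters else "(none yet)"

    lines = story.get("lines", [])
    # Only the most recent lines are included, to keep the prompt short.
    recent = lines[-max_recent_lines:] if max_recent_lines > 0 else []
    story_so_far = "\n".join(recent) if recent else "(the story has not started)"

    return (
        "You are continuing a collaborative story.\n"
        f"Established characters: {known}\n"
        "Do not reuse an established name for a different character.\n"
        "Story so far:\n"
        f"{story_so_far}\n"
        "Write the next line only."
    )
```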
AI Prompts for Images
- Generating cohesive visuals that truly match the story required iterative experimentation with DALL·E prompts—balancing detail with creative freedom.
- We learned that specifying tone (dark, whimsical), color palettes, and key visuals improved the results.
Overall, this project showed us how to blend user creativity with AI assistance—from line-by-line generation to a final illustrated product. We hope to expand it further, possibly adding more sophisticated editing tools or branching narratives in the future!