# Midnight at the Voss Manor - Hackathon Submission

## 🎭 The Inspiration
What if ghosts could argue with each other about your choices? What if AI agents had distinct personalities and debated morality in real-time?
Midnight at the Voss Manor was born from a simple question: Can we create a narrative experience where multiple AI agents feel like real characters with conflicting perspectives?
We wanted to push beyond single-agent chatbots and create a Frankenstein's monster of technologies - stitching together LLMs, text-to-speech, blockchain verification, and procedural storytelling into something hauntingly alive.
## 🏗️ How We Built It

This project is a chimera of seemingly incompatible technologies:

### The Tech Stack
- 5 AI Ghost Agents powered by Groq's Llama 3.3 70B - each with distinct personalities (maternal Elara, amnesiac scientist Harlan, innocent child Mira, regretful Theo, cold Selene)
- Multi-Provider TTS Orchestra - Azure, Google, Play.ht, and Gemini for unique voice acting
- Model Context Protocol (MCP) servers for:
  - Blockchain vow verification (checking eternal promises)
  - Memory persistence across scenes
  - Sentiment analysis of player choices
  - AI-generated imagery with Gemini
- Next.js 14 with React for the interactive frontend
- Suno AI for atmospheric background music
- Gemini Nano Banana Pro for gothic-cyberpunk scene generation
### The Architecture

```javascript
// Each ghost has a unique personality system
const ghostPersonalities = {
  elara: "Maternal, gentle, focuses on family harmony",
  harlan: "Scientific, amnesiac, logical but emotionally confused",
  mira: "Childlike, innocent, wants play and attention",
  theo: "Dramatic, regretful, seeks redemption",
  selene: "Cold but softening, demands truth"
}
```
The ghost debate system is the heart of the experience. When you make a choice, all 5 agents independently generate responses, then reach consensus:
```javascript
// Real-time multi-agent debate
const debate = await fetch('/api/ghost-debate', {
  method: 'POST',
  body: JSON.stringify({
    puzzleContext: currentScene,
    playerMessage: playerChoice
  })
})
```
## 💀 Challenges We Faced

### 1. Groq API Integration & Rate Limiting

Groq's Llama 3.3 70B is incredibly fast, but coordinating 5 simultaneous agent calls was tricky:
- Challenge: Rate limits when all ghosts speak at once, and API errors breaking the narrative flow
- Solution:
  - Implemented sequential processing instead of parallel calls
  - Added exponential backoff retry logic (3 attempts with 1s, 2s, 4s delays)
  - Created fallback personality-specific responses for each ghost
  - Used streaming responses to show a "thinking" state
- Learning: Fast inference doesn't mean unlimited throughput - proper error handling is critical for production
```javascript
// Sequential ghost debate with exponential-backoff retry
async function callGroqWithRetry(prompt, retries = 3) {
  for (let i = 0; i < retries; i++) {
    try {
      return await groq.chat.completions.create({
        model: "llama-3.3-70b-versatile",
        messages: [{ role: "system", content: prompt }]
      })
    } catch (error) {
      if (i === retries - 1) return fallbackResponse // canned personality-specific reply
      await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000)) // 1s, 2s, 4s
    }
  }
}

// Ghosts speak one at a time to stay under the rate limit
const debate = []
for (const ghost of ghosts) {
  const response = await callGroqWithRetry(ghost.personality)
  debate.push({ ghost: ghost.name, message: response })
}
```
### 2. MCP Server Development & Integration
Building custom Model Context Protocol servers was uncharted territory:
- Challenge:
  - No existing examples for blockchain vow verification or game state management
  - Understanding the JSON-RPC 2.0 message format
  - Debugging MCP communication between Kiro and custom servers
  - Handling async operations properly
- Solution:
  - Studied the MCP specification thoroughly
  - Created 4 custom MCP servers from scratch:
    - `blockchain-vows-server.js` - eternal promise verification (4 vows stored)
    - `memory-server.js` - cross-scene state persistence
    - `sentiment-server.js` - player choice analysis
    - `image-gen-server.js` - Gemini integration for visuals
  - Built a test harness in Kiro IDE to validate server responses
  - Added extensive logging for debugging
- Learning: MCP is powerful but requires deep understanding of the protocol spec and async Node.js patterns. The ability to extend the IDE with custom tools during development was game-changing.
```javascript
import {
  ListToolsRequestSchema,
  CallToolRequestSchema
} from "@modelcontextprotocol/sdk/types.js"

// Custom MCP server for vow verification: advertise the tool...
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [{
    name: "check_vow",
    description: "Verify if a character kept their eternal vow",
    inputSchema: {
      type: "object",
      properties: {
        person: { type: "string", description: "Who made the vow" },
        vow: { type: "string", description: "What was promised" }
      },
      required: ["person", "vow"]
    }
  }]
}))

// ...and handle the actual vow checking
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { person, vow } = request.params.arguments
  const vowRecord = vowLedger.find(v => v.person === person && v.vow === vow)
  return {
    content: [{
      type: "text",
      text: vowRecord
        ? `${person}'s vow to "${vow}" was ${vowRecord.kept ? 'kept' : 'broken'}. ${vowRecord.reason}`
        : "No record found"
    }]
  }
})
```
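On the wire, these handlers sit behind JSON-RPC 2.0 framing. As a minimal sketch, this is roughly the envelope a client sends to invoke the tool above (field layout follows the JSON-RPC 2.0 and MCP specs; the helper name, `id` value, and example vow text are illustrative):

```javascript
// Build the JSON-RPC 2.0 envelope for an MCP tools/call request
function buildToolCallRequest(id, person, vow) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "check_vow",
      arguments: { person, vow }
    }
  }
}

// Example (illustrative values, not from the actual vow ledger)
const req = buildToolCallRequest(1, "Theo", "return home")
```

Being able to hand-build and replay these envelopes was what made the Kiro test harness practical.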
### 3. Agent-Driven Development with Kiro IDE
This entire project was built using Kiro's AI agent - a meta experience of AI building AI:
- Challenge:
  - Teaching the agent about game design, narrative structure, and multi-agent systems
  - Maintaining context across multiple development sessions
  - Striking the right balance between creativity and technical accuracy
  - Communicating complex requirements clearly
- Solution:
  - Created detailed steering rules in `.kiro/steering/ghost-agent-rules.md` and `scene-structure.md`
  - Used a spec-driven workflow: requirements → design → tasks → implementation
  - Leveraged MCP servers during development for real-time testing
  - Iterated on prompts and provided clear acceptance criteria
  - Used agent hooks to automate repetitive tasks
- Learning: Agent-driven development works best with:
  - Clear specifications upfront (we used the EARS format for requirements)
  - Iterative refinement through conversation
  - Custom steering rules for domain-specific knowledge
  - Breaking complex features into small, testable tasks
  - Treating the agent as a pair programmer, not a magic solution
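For readers unfamiliar with EARS (Easy Approach to Requirements Syntax), requirements follow templated forms such as `WHEN <trigger>, the <system> SHALL <response>`. A hypothetical example of the kind of requirement we wrote (wording illustrative, not copied from our actual spec):

```text
WHEN the player submits a choice,
the ghost-debate system SHALL collect one response from each of the
five ghost agents before rendering the consensus.
```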
### 4. The Audio Desync Problem
In production, audio would play before images loaded, breaking immersion:
- Challenge: Localhost worked fine, but the Vercel deployment had 1-2 second delays between scene transitions
- Solution:
  - Implemented image preloading for all 28 scene images on app start
  - Added a loading screen with progress tracking
  - Synchronized audio playback with image load events
  - Used the `unoptimized` prop to bypass Next.js image optimization
```javascript
const [imageLoaded, setImageLoaded] = useState(false)

// setImageLoaded(true) fires from the scene image's onLoad handler
useEffect(() => {
  if (!imageLoaded) return
  // Only play audio after the image has rendered
  audio.play()
}, [imageLoaded])
```
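The preloading half of the fix can be sketched as a small helper. Here `loadImage` is injected (in the app it would wrap `new Image()` in a promise that resolves on `onload`), and the function names are our own, not a library API:

```javascript
// Preload all scene images up front, reporting fractional progress
// as each one finishes so a loading screen can track it.
async function preloadImages(urls, loadImage, onProgress) {
  let loaded = 0
  await Promise.all(urls.map(async (url) => {
    await loadImage(url)
    loaded += 1
    onProgress(loaded / urls.length)
  }))
  return loaded
}
```

Once the returned promise resolves, the loading screen is dismissed and the first scene's audio is allowed to start, which is what keeps sound and image in sync.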
### 5. Ghost Personality Drift
Early versions had ghosts that sounded too similar despite different prompts:
- Challenge: Generic LLM responses lacking character voice - all ghosts converged to "helpful AI" tone
- Solution:
  - Explicit personality constraints in system prompts, with speech-pattern examples
  - Different TTS voices (Azure's Jenny, Guy, Aria, Davis, Sara)
  - Character background context injected into every API call
  - Temperature tuning per character (Mira: 0.9 for playfulness, Harlan: 0.6 for precision)
  - Added "never" constraints: "Mira never uses complex vocabulary. Selene never speaks warmly."
```javascript
const characterPrompts = {
  mira: `You are Mira, a 7-year-old ghost. You don't understand death.
Speak in 1-2 short sentences. Use simple words. Be playful and innocent.
Example: "I like butterflies! Can we play?"`,
  selene: `You are Selene, cold and betrayed. You speak formally and tersely.
You're softening but still guarded. Be direct, never warm.
Example: "Theo returned. But trust... that takes time."`
}
```
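The per-character temperature tuning pairs naturally with these prompts. A minimal sketch (Mira's 0.9 and Harlan's 0.6 are our tuned values; the 0.7 default and the helper name are assumptions for illustration):

```javascript
// Sampling temperature per ghost; higher means more playful variance
const ghostTemperature = { mira: 0.9, harlan: 0.6 }

function buildGhostRequest(ghost, systemPrompt) {
  return {
    model: "llama-3.3-70b-versatile",
    temperature: ghostTemperature[ghost] ?? 0.7, // assumed default for the others
    messages: [{ role: "system", content: systemPrompt }]
  }
}
```

Keeping the settings in one map made it easy to tune a single ghost without touching the shared request-building code.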
### 6. Multi-Provider TTS Orchestration
Coordinating 4 different TTS providers with different APIs, rate limits, and voice quality:
- Challenge:
  - Azure has the best quality but strict rate limits
  - Google is reliable but less expressive
  - Play.ht offers unique voices but is slower
  - Gemini is fast but lower quality
  - Each provider has a different API format and error handling
- Solution:
  - Created a unified `speechService` interface abstracting provider differences
  - Implemented a provider cascade: Azure → Google → Play.ht → Gemini → Browser TTS
  - Added voice caching to reduce API calls
  - Used browser TTS as the final fallback for offline play
  - Mapped each ghost to specific voices across providers
```javascript
// Cascade through TTS providers until one succeeds
async function speak(text, character) {
  const providers = [
    { name: 'Azure', voice: 'en-US-JennyNeural' },
    { name: 'Google', voice: 'en-US-Wavenet-F' },
    { name: 'PlayHT', voice: 'jennifer' },
    { name: 'Gemini', voice: 'en-US-Standard-A' }
  ]
  for (const provider of providers) {
    try {
      // speechService dispatches to the matching provider SDK
      return await speechService.synthesize(provider, text, voiceMap[character])
    } catch (error) {
      console.warn(`${provider.name} failed, trying next...`)
    }
  }
  // Final fallback: browser TTS works even offline
  return browserTTS.speak(text)
}
```
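The voice cache mentioned above can be as simple as a Map keyed by provider, voice, and text. A sketch (the names are ours, not a library API) that skips the network when a line was already synthesized:

```javascript
// Return cached audio for a (provider, voice, text) triple, synthesizing at most once
const audioCache = new Map()

async function cachedSynthesize(provider, voice, text, synthesize) {
  const key = `${provider}:${voice}:${text}`
  if (!audioCache.has(key)) {
    audioCache.set(key, await synthesize(text, voice))
  }
  return audioCache.get(key)
}
```

Since ghosts repeat many stock lines across playthroughs, even this naive cache cut a noticeable share of TTS calls.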
### 7. Debate Consensus Without Losing Drama
We wanted ghosts to disagree authentically but still reach meaningful conclusions:
- Challenge:
  - Early attempts had ghosts agreeing too quickly (boring)
  - Or arguing endlessly without resolution (frustrating)
  - Forced consensus felt artificial and broke immersion
- Solution:
  - Each ghost generates its response independently (no shared context between them)
  - Consensus is generated separately as a final step, acknowledging conflicts
  - The player sees the full debate unfold, not just the conclusion
  - Added a "reflection" phase where ghosts can change their minds
  - The consensus prompt explicitly asks to honor disagreements: "Acknowledge where the family disagrees, but find common ground"
- Learning: Authentic conflict requires isolation during generation, but meaningful resolution requires synthesis
```javascript
// Generate independent responses (no shared context between ghosts)
const debate = await Promise.all(
  ghosts.map(ghost => generateResponse(ghost, context))
)

// Then synthesize a consensus that honors disagreements
const consensus = await generateConsensus({
  debate,
  context,
  instruction: "Acknowledge disagreements but find common ground"
})

// Result: "While Harlan argues for logic and Mira wants play,
// the family agrees that love transcends both..."
```
## 🧠 What We Learned

### 1. Multi-Agent Orchestration is Hard
Getting 5 AI personalities to feel distinct while maintaining narrative coherence required careful prompt engineering. Each agent needed:
- A unique voice (both literally via TTS and figuratively via personality)
- Consistent memory of past interactions
- The ability to disagree without derailing the story
### 2. TTS Provider Fallbacks are Essential
We implemented a cascade system across 4 TTS providers because:
- Azure has the best quality but strict rate limits
- Google is reliable but less expressive
- Play.ht offers unique voices
- Gemini provides a solid fallback
### 3. MCP is a Game-Changer
The Model Context Protocol let us extend Kiro IDE with custom tools during development:
- Testing blockchain vow verification without deploying
- Debugging agent memory persistence
- Analyzing sentiment in real-time
### 4. Image Preloading Matters
Production deployment revealed a 1-second delay between scenes. We solved it by:
- Preloading all 28 scene images on app start
- Adding a loading screen with progress tracking
- Using `unoptimized` images for instant rendering
## 🎃 Why "Frankenstein" Category?
This project stitches together incompatible technologies into something unexpectedly powerful:
- 5 LLM agents arguing in real-time
- 4 TTS providers creating a voice orchestra
- Blockchain concepts (vow verification) in a narrative game
- MCP protocol extending the development environment
- AI-generated art & music creating atmosphere
- Next.js + React for interactive storytelling
Like Frankenstein's monster, it's made of disparate parts - but together, they create something alive.
## 🌟 What's Next?
- Branching narratives based on debate outcomes
- Player memory system that remembers choices across sessions
- More ghost interactions - what if they could possess objects?
- Multiplayer debates - multiple players influencing the ghosts
Built with Kiro IDE, powered by Groq, voiced by Azure/Google/Play.ht/Gemini, scored by Suno AI, and visualized by Gemini.
## Built With
- ai
- azure-tts
- gemini
- google-tts
- groq
- kiro-ide
- llama-3.3
- model-context-protocol
- multi-agent
- next.js
- node.js
- play.ht
- react
- suno-ai
- typescript
- vercel