Inspiration
Game developers, writers, and studios need diverse, realistic character personas, but creating voiced characters is expensive and time-consuming: hiring voice actors costs $500-2,000 per character. We built Persona Engine to solve this by generating thousands of unique, fully voiced AI characters in seconds.
What it does
Persona Engine creates complete character profiles with custom AI voices using ElevenLabs:
- AI-Powered Voice Generation: Each persona gets a unique voice generated via ElevenLabs Voice Design API
- Intelligent Character Creation: GPT-4 generates personality traits, backstories, relationships, and attributes
- Real-Time Voice Dialogue: Characters speak naturally using ElevenLabs Text-to-Speech
- Multi-Language Support: Generate voices in 32 languages via ElevenLabs
- Enterprise API: Seamless integration for game studios and creative tools
How we built it
Tech Stack:
- ElevenLabs Voice Design API for custom voice generation from personality descriptions
- ElevenLabs TTS API (Turbo v2.5) for low-latency real-time dialogue
- Google Cloud Vertex AI for personality analysis
- OpenAI GPT-4 for character profile generation
- Firebase for real-time collaboration
- React frontend + Node.js backend
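As a rough illustration of the TTS entry in the list above, the sketch below builds (without sending) a request to the ElevenLabs text-to-speech endpoint with the Turbo v2.5 model. The helper name `buildTtsRequest` and the `HttpRequest` shape are hypothetical and not taken from the project's code. The endpoint path, the `xi-api-key` header, and the `eleven_turbo_v2_5` model ID follow the public ElevenLabs API as commonly documented and should be checked against the current reference before use.

```typescript
// Library-agnostic description of an HTTP request, so any client can send it.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds, but does not send, a text-to-speech request.
function buildTtsRequest(apiKey: string, voiceId: string, text: string): HttpRequest {
  if (text.trim().length === 0) {
    throw new Error("text must not be empty");
  }
  return {
    // Assumed endpoint shape: /v1/text-to-speech/{voice_id}
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      text,
      model_id: "eleven_turbo_v2_5", // Turbo v2.5, the low-latency model named above
    }),
  };
}
```

Keeping request construction separate from sending makes it easy to unit-test on the Node.js backend and keeps the API key out of the React frontend.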
Persona-to-Voice Pipeline Architecture:
User Input → Gemini Pro (Personality Analysis)
↓
Structured Persona Profile
↓
ElevenLabs Voice Design API
↓
Custom Voice Model (age/gender/accent/style)
↓
ElevenLabs Text-to-Speech API
↓
Real-Time Voice Synthesis
↓
Firebase Storage + CDN Delivery
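The flow above can be sketched as a chain of four injected stages. This is a minimal sketch rather than the project's actual code: the `PipelineStages` interface, the `createVoicedPersona` function, and the `PersonaProfile` fields are all hypothetical names chosen for illustration, and each stage would wrap the corresponding service (Gemini Pro, Voice Design, Text-to-Speech, Firebase Storage) in a real deployment.

```typescript
// Hypothetical profile shape; the real schema is not given in this write-up.
interface PersonaProfile {
  name: string;
  traits: string[];
  backstory: string;
}

// Each stage is injected, so real API clients or test stubs can be swapped in.
interface PipelineStages {
  analyzePersonality: (userInput: string) => Promise<PersonaProfile>;
  designVoice: (profile: PersonaProfile) => Promise<string>; // resolves to a voice ID
  synthesizeSpeech: (voiceId: string, line: string) => Promise<Uint8Array>;
  storeAudio: (audio: Uint8Array) => Promise<string>; // resolves to a delivery URL
}

interface VoicedPersona {
  profile: PersonaProfile;
  voiceId: string;
  sampleUrl: string;
}

// Runs the stages in the same order as the diagram: analysis, voice design,
// speech synthesis, then storage.
async function createVoicedPersona(
  stages: PipelineStages,
  userInput: string,
  sampleLine: string
): Promise<VoicedPersona> {
  const profile = await stages.analyzePersonality(userInput);
  const voiceId = await stages.designVoice(profile);
  const audio = await stages.synthesizeSpeech(voiceId, sampleLine);
  const sampleUrl = await stages.storeAudio(audio);
  return { profile, voiceId, sampleUrl };
}
```

Because the stages run strictly in sequence, a failure in any one of them rejects the whole call, which keeps partially built personas out of storage.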
Gemini Pro Integration:
- Analyzes personality descriptions supplied by the user
- Generates character profiles with traits, backstories, and speech patterns
- Creates context-aware dialogue that matches persona characteristics
- Optimizes personality consistency across interactions
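Since the language model returns text, the backend needs to check that a generated profile is well-formed before passing it on. The following is a minimal sketch assuming the model is prompted to reply with JSON. The `PersonaProfile` fields mirror the traits, backstories, and speech patterns mentioned above, but the exact field names and the `parsePersonaProfile` helper are hypothetical.

```typescript
// Hypothetical schema covering the attributes listed above.
interface PersonaProfile {
  name: string;
  traits: string[];
  backstory: string;
  speechPatterns: string[];
}

// Type guard: true only for arrays whose elements are all strings.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parses model output and rejects anything that does not match the schema.
function parsePersonaProfile(rawJson: string): PersonaProfile {
  const data: unknown = JSON.parse(rawJson); // throws SyntaxError on invalid JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("profile must be a JSON object");
  }
  const { name, traits, backstory, speechPatterns } = data as Record<string, unknown>;
  if (typeof name !== "string" || name.length === 0) {
    throw new Error("name must be a non-empty string");
  }
  if (!isStringArray(traits) || traits.length === 0) {
    throw new Error("traits must be a non-empty array of strings");
  }
  if (typeof backstory !== "string") {
    throw new Error("backstory must be a string");
  }
  if (!isStringArray(speechPatterns)) {
    throw new Error("speechPatterns must be an array of strings");
  }
  return { name, traits, backstory, speechPatterns };
}
```

Rejecting malformed profiles at this boundary means later stages can rely on every field being present.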
ElevenLabs Voice Pipeline:
- Voice Design API generates unique voice parameters from persona traits
- Maps personality dimensions to voice characteristics (pitch, tempo, timbre)
- Creates 20,000+ distinct voices using ElevenLabs' generation capabilities
- ElevenLabs TTS API converts text to speech with personality-matched voices
- Sub-300ms latency for real-time dialogue
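One way to picture the trait-to-voice mapping described above is a small rule table that turns numeric personality scores into pitch, tempo, and timbre labels, then into a text description that a voice design step could consume. This is an illustrative sketch only: the three dimensions (`energy`, `warmth`, `authority`), the thresholds, and the specific rules are assumptions, not the project's actual mapping.

```typescript
// Hypothetical personality dimensions, each scored from 0 to 1.
interface PersonalityDimensions {
  energy: number;
  warmth: number;
  authority: number;
}

type Pitch = "low" | "medium" | "high";
type Tempo = "slow" | "moderate" | "fast";
type Timbre = "crisp" | "neutral" | "warm";

interface VoiceCharacteristics {
  pitch: Pitch;
  tempo: Tempo;
  timbre: Timbre;
}

// Splits the [0, 1] range into thirds and picks the matching label.
function bucket<T>(value: number, low: T, mid: T, high: T): T {
  if (!(value >= 0 && value <= 1)) {
    throw new RangeError("dimension must be a number in [0, 1]");
  }
  if (value < 1 / 3) return low;
  if (value < 2 / 3) return mid;
  return high;
}

// Illustrative rules: more authority lowers pitch, more energy speeds up tempo,
// more warmth softens timbre.
function mapToVoice(p: PersonalityDimensions): VoiceCharacteristics {
  return {
    pitch: bucket<Pitch>(p.authority, "high", "medium", "low"),
    tempo: bucket<Tempo>(p.energy, "slow", "moderate", "fast"),
    timbre: bucket<Timbre>(p.warmth, "crisp", "neutral", "warm"),
  };
}

// Renders the characteristics as a plain-language voice description.
function describeVoice(v: VoiceCharacteristics): string {
  return `A ${v.timbre} voice with ${v.pitch} pitch and a ${v.tempo} speaking pace.`;
}
```

For example, `describeVoice(mapToVoice({ energy: 0.9, warmth: 0.8, authority: 0.9 }))` yields "A warm voice with low pitch and a fast speaking pace."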
Production Infrastructure:
- Google Cloud Functions for serverless voice generation
- Vertex AI Gemini endpoints for personality analysis
- Firebase for data persistence and CDN delivery
- Service account authentication for secure API access
Accomplishments
- Generate 20,000+ unique voiced personas per month
- Speech synthesis latency under 300 ms (ElevenLabs Turbo v2.5)
- Production-ready API for game studios
- $0.05 per character vs. $500-2,000 with traditional voice acting
- 32-language voice support via ElevenLabs
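To put the cost figures above side by side, the short calculation below uses only the numbers quoted in this write-up ($0.05 per generated character, $500 to $2,000 per traditionally voiced character, 20,000 personas per month). The function and constant names are hypothetical, and the comparison assumes every persona would otherwise have been voiced at the traditional rate.

```typescript
// Figures quoted above, held in whole cents to avoid floating-point drift.
const GENERATED_CENTS_PER_PERSONA = 5; // $0.05
const TRADITIONAL_CENTS_MIN = 50000; // $500
const TRADITIONAL_CENTS_MAX = 200000; // $2,000

interface CostComparison {
  generatedDollars: number;
  traditionalDollarsMin: number;
  traditionalDollarsMax: number;
  savingsFactorMin: number;
  savingsFactorMax: number;
}

// Compares total cost for a given number of personas under both approaches.
function compareCosts(personaCount: number): CostComparison {
  if (!(personaCount >= 0) || Math.floor(personaCount) !== personaCount) {
    throw new RangeError("personaCount must be a non-negative integer");
  }
  return {
    generatedDollars: (personaCount * GENERATED_CENTS_PER_PERSONA) / 100,
    traditionalDollarsMin: (personaCount * TRADITIONAL_CENTS_MIN) / 100,
    traditionalDollarsMax: (personaCount * TRADITIONAL_CENTS_MAX) / 100,
    savingsFactorMin: TRADITIONAL_CENTS_MIN / GENERATED_CENTS_PER_PERSONA,
    savingsFactorMax: TRADITIONAL_CENTS_MAX / GENERATED_CENTS_PER_PERSONA,
  };
}

const monthlyComparison = compareCosts(20000);
```

For 20,000 personas this works out to $1,000 generated versus $10 million to $40 million at the quoted traditional rates, a factor of 10,000 to 40,000.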
What's next
- Voice-to-persona pipeline: Speak for 30 seconds → Get matching AI character
- Emotional voice modulation based on persona traits
- VR integration with spatial audio