Inspiration

Game developers, writers, and studios need diverse, realistic character personas, but creating voiced characters is expensive and time-consuming: hiring voice actors costs $500-2000 per character. We built Persona Engine to solve this by generating thousands of unique, fully voiced AI characters in seconds.

What it does

Persona Engine creates complete character profiles with custom AI voices using ElevenLabs:

  • AI-Powered Voice Generation: Each persona gets a unique voice generated via ElevenLabs Voice Design API
  • Intelligent Character Creation: GPT-4 generates personality traits, backstories, relationships, and attributes
  • Real-Time Voice Dialogue: Characters speak naturally using ElevenLabs Text-to-Speech
  • Multi-Language Support: Generate voices in 32 languages via ElevenLabs
  • Enterprise API: Seamless integration for game studios and creative tools

How we built it

Voice Pipeline:

  • ElevenLabs Voice Design API for custom voice generation from personality descriptions
  • ElevenLabs TTS API (Turbo v2.5) for low-latency real-time dialogue
  • Google Cloud Vertex AI for personality analysis
  • OpenAI GPT-4 for character profile generation
  • Firebase for real-time collaboration
  • React frontend + Node.js backend
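As a rough illustration of the TTS item in the list above, here is a minimal sketch of how one line of dialogue could be turned into a request for the ElevenLabs Text-to-Speech endpoint with the Turbo v2.5 model. The endpoint path, the `xi-api-key` header, and the `eleven_turbo_v2_5` model ID reflect our understanding of the public ElevenLabs REST API and should be checked against the current documentation. The helper name `build_tts_request` and the sample voice ID are hypothetical. The request is only built, never sent.

```python
import json
import urllib.request

# Assumed endpoint shape for the ElevenLabs TTS REST API; verify against the docs.
ELEVENLABS_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(
    voice_id: str,
    text: str,
    api_key: str,
    model_id: str = "eleven_turbo_v2_5",  # assumed ID for Turbo v2.5
) -> urllib.request.Request:
    """Build (but do not send) a TTS request for one line of dialogue."""
    if not text.strip():
        raise ValueError("text must not be empty")
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        ELEVENLABS_TTS_URL.format(voice_id=voice_id),
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical usage: "demo-voice-id" and "YOUR_API_KEY" are placeholders.
request = build_tts_request("demo-voice-id", "Welcome to the forge.", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the built request to `urllib.request.urlopen` would return the audio bytes, assuming a valid key and voice ID.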

Voice-to-Persona Pipeline Architecture:

User Input → Gemini Pro (Personality Analysis)
    ↓
  Structured Persona Profile
    ↓
ElevenLabs Voice Design API
    ↓
  Custom Voice Model (age/gender/accent/style)
    ↓
ElevenLabs Text-to-Speech API
    ↓
  Real-Time Voice Synthesis
    ↓
Firebase Storage + CDN Delivery
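
The flow in the diagram can be sketched as a chain of stages that each take and return a shared state object. This is only an illustration of the data flow, not the project's actual code: every stage below is a stub, and the names `PipelineState`, `run_pipeline`, and the `cdn.example.invalid` URL are hypothetical. In a real deployment each stub would call the corresponding service (Gemini Pro, the Voice Design API, the TTS API, Firebase Storage).

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineState:
    """Data handed from one stage to the next."""
    user_input: str
    persona: dict = field(default_factory=dict)
    voice_id: str = ""
    audio: bytes = b""
    audio_url: str = ""


Stage = Callable[[PipelineState], PipelineState]


def analyze_personality(state: PipelineState) -> PipelineState:
    # Stub for the Gemini Pro step: wrap the input in a structured profile.
    state.persona = {"description": state.user_input, "traits": []}
    return state


def design_voice(state: PipelineState) -> PipelineState:
    # Stub for the Voice Design step: derive a stable placeholder voice ID.
    digest = hashlib.sha1(state.user_input.encode("utf-8")).hexdigest()[:8]
    state.voice_id = f"voice-{digest}"
    return state


def synthesize_speech(state: PipelineState) -> PipelineState:
    # Stub for the TTS step: fake audio bytes instead of a real API call.
    state.audio = f"[{state.voice_id}] {state.user_input}".encode("utf-8")
    return state


def store_audio(state: PipelineState) -> PipelineState:
    # Stub for the storage/CDN step: a placeholder delivery URL.
    state.audio_url = f"https://cdn.example.invalid/{state.voice_id}.mp3"
    return state


STAGES: List[Stage] = [analyze_personality, design_voice, synthesize_speech, store_audio]


def run_pipeline(user_input: str, stages: List[Stage] = STAGES) -> PipelineState:
    """Run every stage in order and return the final state."""
    state = PipelineState(user_input=user_input)
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline("A gruff dwarven blacksmith with a soft spot for poetry")
print(result.voice_id, result.audio_url)
```

Keeping each stage behind the same signature makes it easy to swap a stub for a real service call, or to time stages individually.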

Gemini Pro Integration:

  • Analyzes personality descriptions using Gemini Pro
  • Generates character profiles with traits, backstories, and speech patterns
  • Creates context-aware dialogue that matches persona characteristics
  • Optimizes personality consistency across interactions
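
To keep profiles consistent across interactions, the model's output generally needs to be validated before later stages use it. The sketch below assumes the model is asked to return JSON with the fields `name`, `traits`, `backstory`, and `speech_pattern`. That schema and the names `PersonaProfile` and `parse_persona` are hypothetical and chosen only to mirror the bullets above.

```python
import json
from dataclasses import dataclass
from typing import Tuple

# Hypothetical schema mirroring "traits, backstories, and speech patterns".
REQUIRED_FIELDS = ("name", "traits", "backstory", "speech_pattern")


@dataclass(frozen=True)
class PersonaProfile:
    name: str
    traits: Tuple[str, ...]
    backstory: str
    speech_pattern: str


def parse_persona(raw_json: str) -> PersonaProfile:
    """Validate a JSON persona returned by the language model."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("persona must be a JSON object")
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError(f"persona is missing fields: {missing}")
    if not isinstance(data["traits"], list) or not data["traits"]:
        raise ValueError("traits must be a non-empty list")
    return PersonaProfile(
        name=str(data["name"]).strip(),
        # Normalize traits so later stages can match on them reliably.
        traits=tuple(str(t).strip().lower() for t in data["traits"]),
        backstory=str(data["backstory"]).strip(),
        speech_pattern=str(data["speech_pattern"]).strip(),
    )


sample = json.dumps({
    "name": "Brakka",
    "traits": ["Gruff", "Loyal"],
    "backstory": "A blacksmith who writes poetry in secret.",
    "speech_pattern": "Short sentences, dry humor.",
})
print(parse_persona(sample))
```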

ElevenLabs Voice Pipeline:

  • Voice Design API generates unique voice parameters from persona traits
  • Maps personality dimensions to voice characteristics (pitch, tempo, timbre)
  • Creates 20,000+ distinct voices using ElevenLabs' generation capabilities
  • ElevenLabs TTS API converts text to speech with personality-matched voices
  • Sub-300ms latency for real-time dialogue
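
One way to picture the mapping from personality dimensions to voice characteristics is a small function that turns trait scores into a text description, which could then be passed to a voice design step. Everything here is an illustrative assumption: the three dimensions (`extraversion`, `warmth`, `dominance`), the thresholds, and the wording of the description are invented for this sketch and are not the project's actual mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceTraits:
    pitch: str
    tempo: str
    timbre: str


def _band(score: float, low: str, mid: str, high: str) -> str:
    """Bucket a 0..1 score into one of three labels (thresholds are arbitrary)."""
    if score < 0.34:
        return low
    if score < 0.67:
        return mid
    return high


def map_traits_to_voice(extraversion: float, warmth: float, dominance: float) -> VoiceTraits:
    """Map hypothetical personality scores (0..1) to pitch, tempo, and timbre."""
    for name, value in (("extraversion", extraversion), ("warmth", warmth), ("dominance", dominance)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return VoiceTraits(
        # Assumption: more dominant characters get a lower pitch.
        pitch=_band(1.0 - dominance, "low-pitched", "medium-pitched", "high-pitched"),
        # Assumption: more extraverted characters speak faster.
        tempo=_band(extraversion, "slow", "moderate", "fast"),
        timbre=_band(warmth, "sharp", "neutral", "warm"),
    )


def describe_voice(age: str, gender: str, accent: str, traits: VoiceTraits) -> str:
    """Compose a plain-text voice description from persona attributes."""
    return (
        f"A {age} {gender} voice with a {accent} accent, {traits.pitch}, "
        f"{traits.tempo} in tempo, with a {traits.timbre} timbre."
    )


traits = map_traits_to_voice(extraversion=0.9, warmth=0.8, dominance=0.9)
print(describe_voice("middle-aged", "male", "Scottish", traits))
```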

Production Infrastructure:

  • Google Cloud Functions for serverless voice generation
  • Vertex AI Gemini endpoints for personality analysis
  • Firebase for data persistence and CDN delivery
  • Service account authentication for secure API access

Accomplishments

  • Generate 20,000+ unique voiced personas per month
  • Speech synthesis latency under 300ms (ElevenLabs Turbo v2.5)
  • Production-ready API for game studios
  • $0.05 per character vs $500-2000 traditional cost
  • 32-language voice support via ElevenLabs

What's next

  • Voice-to-persona pipeline: Speak for 30 seconds → Get matching AI character
  • Emotional voice modulation based on persona traits
  • VR integration with spatial audio

Updates

Voice-to-Persona Pipeline: Technical Architecture Complete

Infrastructure is now complete for the voice-to-persona feature, which showcases the Google Cloud + ElevenLabs integration.

5-Stage Pipeline:

  1. Speech-to-Text (Google Cloud)
  2. Gemini Pro personality analysis
  3. Persona Engine character generation
  4. ElevenLabs Voice Design API
  5. ElevenLabs TTS Turbo v2.5

Infrastructure: Service Account persona-engine-vertex-ai@vernal-tracer-474221-b1 | Vertex AI Active | ElevenLabs 10k credits ready

Impact: $500-2000 -> $0.05 per character (over 99.9% reduction) | 2-4 weeks -> 30 seconds
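
The impact figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this post ($500-2000 versus $0.05, and 2-4 weeks versus 30 seconds). The helper names are hypothetical.

```python
def percent_reduction(old_cost: float, new_cost: float) -> float:
    """Percentage by which new_cost is lower than old_cost."""
    if old_cost <= 0:
        raise ValueError("old_cost must be positive")
    return (old_cost - new_cost) / old_cost * 100.0


def speedup(old_seconds: float, new_seconds: float) -> float:
    """How many times faster the new process is."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return old_seconds / new_seconds


SECONDS_PER_WEEK = 7 * 24 * 3600

cost_low = percent_reduction(500.0, 0.05)    # about 99.99
cost_high = percent_reduction(2000.0, 0.05)  # about 99.9975
time_low = speedup(2 * SECONDS_PER_WEEK, 30)   # 40320.0
time_high = speedup(4 * SECONDS_PER_WEEK, 30)  # 80640.0

print(f"Cost reduction: {cost_low:.2f}% to {cost_high:.4f}%")
print(f"Speedup: {time_low:,.0f}x to {time_high:,.0f}x")
```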

Market: $5B character design (game studios, VR/AR)

Demo: https://www.youtube.com/watch?v=0lDXsFn4xUU

Competing for $12,500 - AI Partner Catalyst ElevenLabs Challenge
