Inspiration

More than half of Americans reported feeling lonely this year. Everyone needs someone to talk to, yet it is hard to find someone who truly listens, remembers, and cares, whatever the time of day or topic.

What it does

Akira is your true companion: a fully conversational AI, powered by a multi-agent memory architecture, that you can personalize both visually and functionally.

Key features include:

  • Memory-Powered: Learns and remembers what matters.

  • Multi-Modal: Holds spoken conversations, analyzes photos, and fetches live web data on demand.

  • Emotion-Aware: Adapts itself to how you feel.

  • Friendly UI: Engaging avatar you can style and interact with.

How we built it

Backend

  • Memory Agent
    Deployed a two-agent architecture on Letta Cloud:

    • Low-Latency Agent uses a lightweight model to deliver instant responses.
    • Context Agent leverages a more powerful model to retrieve and feed relevant context to the low-latency agent.
  • Multi-Modal Pipeline

    • Voice: Vapi Cloud’s STT → custom Letta LLM → TTS workflow for seamless spoken conversations.
    • Images: Next.js API routes send uploads to Claude Sonnet 4 for image-to-text processing, then inject the resulting captions into the Vapi conversation stream.
    • Web Search: Letta’s tool use fetches live web data on demand and augments the dialogue.
  • Emotion-Aware Dialogue
    Tuned Letta agents and Vapi configurations to produce a natural, empathetic conversational flow that mirrors real human engagement.

  • Auth & Data
    Secure user authentication and database management ensure controlled access, encrypted storage, and efficient handling of conversation and profile data.
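
To illustrate the two-agent idea described above, here is a minimal sketch, not the actual Letta Cloud setup. Every name in it (ContextAgent, LowLatencyAgent, MemoryNote, the sample notes) is hypothetical, and simple keyword matching stands in for the more powerful model that would select relevant context in practice.

```typescript
// Hypothetical sketch of the two-agent pattern; none of this is the Letta SDK.
type MemoryNote = { topic: string; detail: string };

// Stand-in for the Context Agent: selects stored notes relevant to a message.
// Assumption: keyword matching replaces the model-driven retrieval.
class ContextAgent {
  constructor(private notes: MemoryNote[]) {}

  retrieve(message: string, limit = 2): MemoryNote[] {
    const words = new Set(message.toLowerCase().split(/\W+/).filter(Boolean));
    return this.notes
      .filter((note) => words.has(note.topic.toLowerCase()))
      .slice(0, limit);
  }
}

// Stand-in for the Low-Latency Agent: builds the prompt that a lightweight
// model would receive, with any retrieved context placed ahead of the message.
class LowLatencyAgent {
  buildPrompt(message: string, context: MemoryNote[]): string {
    const memory = context.map((n) => `- ${n.topic}: ${n.detail}`).join("\n");
    return memory
      ? `Known about the user:\n${memory}\n\nUser: ${message}`
      : `User: ${message}`;
  }
}

const contextAgent = new ContextAgent([
  { topic: "dog", detail: "has a corgi named Mochi" },
  { topic: "exam", detail: "chemistry final on Friday" },
]);
const fastAgent = new LowLatencyAgent();

const message = "I walked my dog before studying";
const prompt = fastAgent.buildPrompt(message, contextAgent.retrieve(message));
console.log(prompt);
```

The point of the split is that the fast agent never searches memory itself; it only consumes whatever context the slower agent has already selected, which keeps its response path short.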

Frontend

  • Framework & Styling
    Built with Next.js, TypeScript, and Tailwind CSS for a fast, responsive, and modern UI.

  • 3D Avatar & Customization
    Three.js renders an interactive avatar you can style—change cosmetics, expressions, and animations—and engage with through gestures and reactions in real time.

Challenges we ran into

  • Emotion & Speech Tuning
    Tuning the Letta memory agents and Vapi’s pipeline to produce natural, empathetic speech took extensive iteration.

  • Integration Complexity
    Orchestrating Vapi Cloud and Letta Cloud for seamless multimodal functionality (voice, text, image, web search) required careful design and execution across multiple services and frameworks.

Accomplishments that we're proud of

  • Deployed a fully functional multimodal, memory-enabled AI companion in under 24 hours.
  • Seamlessly combined Vapi Cloud’s STT→Letta LLM→TTS pipeline with Letta Cloud’s two-agent memory architecture for real-time, context-rich conversations.
  • Built a friendly, interactive UI and a fully customizable 3D avatar using Three.js, Next.js, TypeScript, and Tailwind CSS—bringing Akira to life.

What we learned

  • A two-agent memory system keeps conversations smooth and engaging by balancing low-latency responses with deep contextual recall.
  • Orchestrating Vapi Cloud’s STT→LLM→TTS pipeline alongside Letta Cloud’s memory agents revealed best practices for multi-cloud service integration and error handling.
  • Iteratively tuning emotion and speech parameters underscored how small prosody adjustments can dramatically improve perceived empathy and naturalness.
  • Building multi-modal support (voice, text, image, web search) showed us the importance of designing flexible data flows and fallbacks for each modality.
  • Empowering users with avatar customization highlighted how personalization drives deeper connection and sustained engagement.
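
As one way to picture the per-modality fallbacks mentioned above, here is a small sketch under stated assumptions. The handler names, the messages, and the simulated failure are all hypothetical, and the handlers are synchronous only to keep the example short; real service calls would be asynchronous.

```typescript
// Hypothetical sketch of per-modality fallbacks; not the project's actual code.
type Modality = "voice" | "text" | "image" | "web";
type Handler = (input: string) => string;

// Wrap a primary handler so that any thrown error routes to a fallback.
function withFallback(primary: Handler, fallback: Handler): Handler {
  return (input) => {
    try {
      return primary(input);
    } catch (_err) {
      return fallback(input);
    }
  };
}

const echo: Handler = (input) => `echo: ${input}`;

// Simulated failure, standing in for a vision request that times out.
const describeImage: Handler = (_input) => {
  throw new Error("vision model timed out");
};

const handlers: Record<Modality, Handler> = {
  voice: withFallback(echo, echo),
  text: echo,
  image: withFallback(
    describeImage,
    (input) => `Could not read ${input}; please describe it in words.`,
  ),
  web: withFallback(echo, () => "Web search is unavailable right now."),
};

function route(modality: Modality, input: string): string {
  return handlers[modality](input);
}

console.log(route("image", "photo.png"));
console.log(route("text", "hi"));
```

Keeping the fallback next to each modality's handler means one failing service degrades only its own channel while the conversation continues.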

What's next for Akira

  • Multi-Language Support
    Expand beyond English to enable truly global companionship in the user’s native language or dialect.

  • Voice-Call Integration
    Allow Akira to “call in” and check on you via phone or integrate directly with VoIP for hands-free conversations.

  • Long-Term Memory Enhancements
    Introduce “memory pruning” and “highlight reels” so Akira can surface your most important moments and learn over months or years.

  • Third-Party Integrations
    Plug into calendars, music services (Spotify, Apple Music), fitness trackers, and smart-home devices to make Akira an even more useful companion.

  • More Avatar Configuration
    Offer finer-grained customization options—hair styles, outfits, expressions, and dynamic animations—so users can craft a unique companion.

  • Immersive Three.js World
    Build a virtual environment where your avatar can explore, interact, and host mini-experiences, turning chats into immersive encounters.
