Oratio: Enterprise-Grade Conversational AI, Simplified

Inspiration

In the enterprise world, conversational AI promises to revolutionize customer service, automate complex workflows, and unlock new efficiencies. Deploying it, however, is hard. The process is fragmented: technical teams get bogged down juggling separate Speech-to-Text (STT) and Text-to-Speech (TTS) models, managing conversation state, attaching tools, and ensuring memory persistence at scale.

This complexity creates a major roadblock, preventing powerful business ideas from ever finding a voice. We were inspired to solve this problem by creating Oratio, a platform that abstracts away the friction and empowers enterprises to build and deploy sophisticated voice agents with ease.

What it does

Oratio is an "agent-in-a-box" platform designed to simplify the entire lifecycle of creating and deploying enterprise-grade conversational AI. It provides a WebSocket API and a Chat API, secured with API keys, so businesses can integrate a voice agent into any application in minutes. The platform handles the complex backend, letting companies focus on the strategic value of what their agent does rather than the low-level details of how it speaks and thinks. At its core, Oratio allows you to design, generate, and orchestrate specialized voice agents for any business need.
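As a rough illustration of what an API-key-secured Chat API call could look like from a client, here is a minimal sketch using only the Python standard library. The `/chat` path, the `Authorization: Bearer` scheme, and the JSON field names (`agent_id`, `message`, `session_id`) are assumptions made for this example and are not taken from Oratio's actual API.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, agent_id, message, session_id=None):
    """Build (but do not send) a POST request for a hypothetical chat endpoint."""
    # Hypothetical payload shape: the real field names may differ.
    payload = {"agent_id": agent_id, "message": message}
    if session_id is not None:
        # Passing a session id lets the backend tie turns to one conversation.
        payload["session_id"] = session_id
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat",  # hypothetical path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # hypothetical auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(request, timeout=10.0):
    """Send the request and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the API key handling in one place and makes the client easy to test without a network connection.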

How we built it

We stood on the shoulders of giants, integrating a powerful, modern tech stack to build a multi-layered agentic system:

Nova Sonic: This was a game-changer for us, providing a single, fluid, low-latency channel for voice interaction instead of separate STT and TTS models. As a result, the agent feels responsive and natural.

Bedrock AgentCore & Strands Agents: This combination formed the reliable "brain" of our agents, providing a robust framework for managing memory, session state, and the execution of complex tasks with external tools.

The "Agent Creator": To accelerate development, we built a MetaAgent, an AI that builds other AIs. Using a LangGraph pipeline powered by DSPy, it takes high-level inputs like an SOP document and automatically plans, generates, and reviews the Python code and system prompts for a new, ready-to-deploy voice agent.
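The plan, generate, and review stages can be pictured with a small plain-Python stand-in. This sketch does not use LangGraph or DSPy, and every name in it (`CreatorState`, `plan_step`, `generate_step`, `review_step`, `run_creator`) is hypothetical. The LLM calls are replaced by deterministic stubs so the control flow is visible: each stage reads and updates a shared state, and the review stage gates the output.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class CreatorState:
    """Shared state passed from stage to stage (hypothetical shape)."""
    sop_text: str
    plan: list = field(default_factory=list)
    code: str = ""
    system_prompt: str = ""
    issues: list = field(default_factory=list)


def plan_step(state):
    # Stand-in for an LLM planning call: one plan item per non-empty SOP line.
    state.plan = [ln.strip() for ln in state.sop_text.splitlines() if ln.strip()]
    return state


def generate_step(state):
    # Stand-in for LLM code generation: emit a tiny module listing the steps.
    lines = ["def handle(step_index):", "    steps = ["]
    lines += [f"        {item!r}," for item in state.plan]
    lines += ["    ]", "    return steps[step_index]"]
    state.code = "\n".join(lines)
    state.system_prompt = (
        "You are a voice agent. Follow these steps: " + "; ".join(state.plan)
    )
    return state


def review_step(state):
    # Cheap automated review: the plan must be non-empty and the code must parse.
    state.issues = []
    if not state.plan:
        state.issues.append("empty plan")
    try:
        ast.parse(state.code)
    except SyntaxError as exc:
        state.issues.append(f"syntax error: {exc.msg}")
    return state


def run_creator(sop_text, max_attempts=2):
    """Plan once, then generate and review until clean or attempts run out."""
    state = plan_step(CreatorState(sop_text=sop_text))
    for _ in range(max_attempts):
        state = review_step(generate_step(state))
        if not state.issues:
            break
    return state
```

In a graph framework, each function would become a node and the retry loop would become a conditional edge from the review node back to the generate node.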

The "Chameleon Agent": This is our dynamic orchestrator. It's a universal runtime that can load and execute any agent generated by our platform. This Chameleon Agent is exposed as a single, powerful tool to our main voice agent, allowing it to seamlessly switch between different capabilities on the fly.
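One way to picture a universal runtime that loads agents on demand is a registry that compiles generated source the first time an agent is requested and caches the result. The sketch below is an assumption-laden illustration, not our actual implementation: the class name `ChameleonRuntime` and the convention that each generated module exposes a `respond(text)` function are invented for this example. It also uses `exec`, which is only appropriate for code you trust or have sandboxed.

```python
from typing import Callable, Dict


class ChameleonRuntime:
    """Loads generated agent code by name and runs it behind one entry point."""

    def __init__(self) -> None:
        self._sources: Dict[str, str] = {}
        self._loaded: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, source: str) -> None:
        # Store (or replace) the generated source and drop any cached version,
        # so a newly generated agent takes effect without restarting anything.
        self._sources[name] = source
        self._loaded.pop(name, None)

    def _load(self, name: str) -> Callable[[str], str]:
        if name not in self._loaded:
            if name not in self._sources:
                raise KeyError(f"unknown agent: {name}")
            namespace: dict = {}
            # Assumes trusted code; a real system would need sandboxing here.
            exec(compile(self._sources[name], f"<agent:{name}>", "exec"), namespace)
            # Hypothetical convention: every agent module defines respond(text).
            self._loaded[name] = namespace["respond"]
        return self._loaded[name]

    def run(self, name: str, user_input: str) -> str:
        """The single 'tool' a front-end voice agent could call."""
        return self._load(name)(user_input)
```

Because the voice agent only ever sees `run(name, user_input)`, adding a capability means registering new source rather than changing the voice agent's tool list.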

Challenges we ran into

Every hackathon is a sprint, and ours was no exception. Our primary challenge was architecting this multi-agent system from the ground up. Designing a reliable LangGraph process for the Agent Creator that could draft, review, and generate correct code from abstract inputs required meticulous planning and debugging. Developing the Chameleon Agent added another layer of complexity: it had to act as a stable, dynamic tool for the main voice agent, capable of loading and running entirely new agents on command, while keeping orchestration seamless.

Accomplishments that we're proud of

We are incredibly proud of architecting and building a complete, multi-agent ecosystem in such a short time. Our key accomplishments include:

  • The MetaAgent: Successfully creating an autonomous "AI to build AI" that dramatically shortens the development cycle for new voice agents.
  • The Chameleon Agent: Building a dynamic orchestrator that makes our platform incredibly flexible and extensible, allowing it to adapt to new tasks without redeployment.
  • An End-to-End Platform: Delivering a fully-integrated solution, from high-level design input to a live, interactive voice agent, all through a simple API.

What we learned

This project was a valuable learning experience. We gained a deep appreciation for how much integrated voice models improve the end-user experience. More importantly, we solidified our belief that the future isn't just about single agents, but about creating ecosystems of specialized agents that work together. The layered architecture of a creator agent, an executor agent, and a user-facing agent proved to be a powerful and scalable model for enterprise use cases.

What's next for Oratio

We're excited about the future of Oratio and see a clear path forward. Our next steps include:

  • Bring Your Own Tools: Enable enterprises to securely connect their own custom tools. This will include support for bringing AWS Lambdas as functions or importing OpenAPI specifications, allowing voice agents to directly and securely interact with internal databases and execute proprietary business logic.
  • UI for Agent Creation: Develop a user-friendly web interface for the Agent Creator, allowing business analysts and project managers to configure and deploy new agents without writing code.
  • Live Transcripts & Human Handoff: Implement a live transcript view for all WebSocket interactions, allowing supervisors to monitor conversations in real time. This will enable seamless handoffs and escalations to human agents through integrations with platforms like Microsoft Teams and Slack.
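For the "Bring Your Own Tools" item above, importing an OpenAPI specification could start with a step like the following: walk the `paths` object and turn each operation into a tool descriptor an agent can be given. This is only a sketch of a planned feature. The function name `openapi_to_tools` and the descriptor fields are hypothetical, and the code handles only inline operation-level parameters (no `$ref` resolution, request bodies, or authentication).

```python
# HTTP method keys that an OpenAPI path item may contain.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def openapi_to_tools(spec):
    """Convert an OpenAPI document (already parsed into a dict) to tool descriptors."""
    tools = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if method.lower() not in HTTP_METHODS:
                continue
            name = operation.get("operationId")
            if not name:
                # Without an operationId there is no stable tool name to expose.
                continue
            tools.append({
                "name": name,
                "description": operation.get("summary", ""),
                "method": method.upper(),
                "path": path,
                "parameters": [
                    p["name"] for p in operation.get("parameters", []) if "name" in p
                ],
            })
    return tools
```

Each descriptor carries enough information for a runtime to route a tool call to the right HTTP request, while the agent itself only sees a name, a description, and parameter names.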
