Inspiration
I am interested in Google Cloud and AI technologies and wanted to learn the ElevenLabs voice solution, so I decided to showcase the integration of the ElevenLabs API with Google Cloud solutions.
What it does
MySupportAgent lets end users listen to the AI agent's responses, which are automatically converted to audio, through a user interface that offers customization of the voice character and settings.
How we built it
A full-stack Python application that includes:
A backend AI agent powered by Google ADK, which uses the Google Search tool in addition to basic weather and time functions
A Streamlit frontend for customizing the voice character and settings and for interacting with the AI agent
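The basic weather and time functions mentioned above could look roughly like the sketch below. It is only an illustration, not the project's actual code: the function names, the in-memory weather table, and the status-dictionary return shape are all assumptions. Plain Python functions of this kind are the sort of thing that can be handed to an ADK agent as tools alongside the Google Search tool.

```python
from datetime import datetime, timezone

# Hypothetical in-memory data; a real tool would query a weather service.
_SAMPLE_WEATHER = {"paris": "sunny, 21 C", "london": "light rain, 14 C"}


def get_weather(city: str) -> dict:
    """Return a weather report for a city as a status dictionary."""
    report = _SAMPLE_WEATHER.get(city.strip().lower())
    if report is None:
        return {"status": "error", "error_message": f"No weather data for '{city}'."}
    return {"status": "success", "report": f"The weather in {city} is {report}."}


def get_current_time() -> dict:
    """Return the current UTC time as a status dictionary."""
    now = datetime.now(timezone.utc)
    return {"status": "success", "report": now.strftime("%Y-%m-%d %H:%M:%S UTC")}
```

Returning a dictionary with an explicit status field gives the agent a uniform way to tell a usable answer from a failure before the text is sent on for audio synthesis.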
Challenges we ran into
Instant, real-time audio playback works only on localhost, not from the remote service. Audio responses are therefore offered to end users as links that they launch manually with a click.
ngrok is used for temporary tunneling to reach the backend service, because the ElevenLabs API rejects requests coming from various sources and recommends upgrading to a paid subscription. Upgrading is the planned next step.
Accomplishments that we're proud of
A complete full-stack application (frontend and backend) that makes interaction with an AI agent more accessible through audio synthesis and lets the end user customize the voice character (with voice settings to follow later).
The solution was developed in Google Cloud Shell and deployed with Google Cloud Run.
What we learned
Integrating the ElevenLabs API (Text to Speech, voice settings, ...)
Progressively activating more ElevenLabs voice settings
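As a rough illustration of the Text to Speech integration with voice settings, the sketch below builds a request for the ElevenLabs REST endpoint using only the standard library. It is a minimal sketch under stated assumptions, not the project's implementation: the helper names, the default setting values, and the choice of model are assumptions, and the voice ID and API key must come from your own ElevenLabs account.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(text, voice_id, api_key, stability=0.5,
                      similarity_boost=0.75, model_id="eleven_multilingual_v2"):
    """Build (but do not send) a Text to Speech request with voice settings."""
    body = {
        "text": text,
        "model_id": model_id,  # assumed default; choose the model you need
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/{voice_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def synthesize(text, voice_id, api_key, **settings) -> bytes:
    """Send the request and return the audio bytes (needs network access)."""
    request = build_tts_request(text, voice_id, api_key, **settings)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

Keeping request construction separate from sending makes it easy to check the voice settings coming from the user interface without spending API quota.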
What's next for MySupportAgent
Activate a paid subscription for ElevenLabs API usage (the free plan is limited to local and personal development)
Introduce a Google Cloud Storage bucket to store the audio files for listening