Inspiration

I am interested in Google Cloud and AI technologies. I wanted to learn the ElevenLabs voice solution, so I decided to showcase the integration of the ElevenLabs API with Google Cloud solutions.

What it does

It lets the end user listen to the AI agent's responses, which are automatically converted to audio, through a user interface that offers customization of the voice character and settings.

How we built it

A full-stack Python application including:

  • a backend AI agent powered by Google ADK, which uses the Google Search tool in addition to basic weather and time functions

  • a frontend Streamlit user interface to customize the voice character and settings and to interact with the AI agent
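The basic weather and time functions mentioned above could look roughly like the sketch below. It is not the project's actual code: the function names, the canned weather data, and the fixed UTC offsets are all assumptions made for illustration. As far as we understand, ADK can take plain Python functions like these as agent tools, but the wiring into the agent is left out here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical canned data; a real tool would call a weather service.
_WEATHER = {
    "paris": "cloudy, 12 degrees Celsius",
    "tokyo": "sunny, 18 degrees Celsius",
}

# Illustrative fixed UTC offsets (hours) for cities without daylight saving time.
_UTC_OFFSETS = {"tokyo": 9, "singapore": 8, "reykjavik": 0}


def get_weather(city: str) -> dict:
    """Return a weather report for a city, or an error the agent can relay."""
    report = _WEATHER.get(city.strip().lower())
    if report is None:
        return {"status": "error", "error_message": f"No weather data for '{city}'."}
    return {"status": "success", "report": f"Weather in {city}: {report}."}


def get_current_time(city: str) -> dict:
    """Return the current local time for a city with a known UTC offset."""
    offset = _UTC_OFFSETS.get(city.strip().lower())
    if offset is None:
        return {"status": "error", "error_message": f"No time zone data for '{city}'."}
    local = datetime.now(timezone.utc).astimezone(timezone(timedelta(hours=offset)))
    return {
        "status": "success",
        "report": f"Current time in {city}: {local:%Y-%m-%d %H:%M} (UTC{offset:+d}).",
    }
```

Returning a dictionary with a status field gives the agent a uniform way to tell a usable answer from a lookup failure and to phrase the reply accordingly.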

Challenges we ran into

Instant real-time audio playback works only on localhost, not from the remote service. Hence the agent's audio responses are offered to end users as links, which they click to play manually.
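The link-based workaround can be sketched as follows, assuming the synthesized audio arrives as MP3 bytes and is served from some static location. The function name, the file naming scheme, and the base URL are hypothetical, not taken from the project.

```python
import uuid
from pathlib import Path


def save_audio_and_get_link(audio_bytes: bytes, audio_dir: Path, base_url: str) -> str:
    """Write MP3 bytes to a uniquely named file and return a URL pointing to it."""
    audio_dir.mkdir(parents=True, exist_ok=True)
    # A random name avoids collisions between responses and between users.
    filename = f"response-{uuid.uuid4().hex}.mp3"
    (audio_dir / filename).write_bytes(audio_bytes)
    # Ensure exactly one slash between the base URL and the file name.
    return base_url.rstrip("/") + "/" + filename
```

In a Streamlit frontend, the returned link could then be shown with something like `st.markdown(f"[Play response]({link})")`, so that playback starts only when the user clicks.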

Ngrok was used for temporary tunneling to access the backend service, because the ElevenLabs API returned failures for requests coming from various sources, with a recommendation to upgrade to a paid subscription. Upgrading is the scheduled next step.

Accomplishments that we're proud of

A complete full-stack application (frontend and backend) that gives the end user better accessibility through audio synthesis of the AI agent's responses, with the possibility to customize the voice character (and voice settings later on).

The solution was developed using Google Cloud Shell and deployed with Google Cloud Run.

What we learned

  • ElevenLabs API (Text to Speech, voice settings, ...) integration

  • Progressive activation of more ElevenLabs voice settings
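To illustrate the integration, here is a minimal sketch of a text-to-speech call against the ElevenLabs REST endpoint using only the Python standard library. It is not the project's actual code: the helper names, the default model id, and the default setting values are assumptions, and the voice id has to come from your own ElevenLabs account.

```python
import json
import urllib.request

ELEVENLABS_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(
    text: str,
    voice_id: str,
    api_key: str,
    stability: float = 0.5,          # assumed default, tune per voice
    similarity_boost: float = 0.75,  # assumed default, tune per voice
    model_id: str = "eleven_multilingual_v2",  # assumed model choice
) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request with voice settings."""
    payload = {
        "text": text,
        "model_id": model_id,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity_boost,
        },
    }
    return urllib.request.Request(
        ELEVENLABS_TTS_URL.format(voice_id=voice_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "xi-api-key": api_key,
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",
        },
        method="POST",
    )


def synthesize(request: urllib.request.Request, timeout: float = 30.0) -> bytes:
    """Send the request and return the audio bytes from the response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping request construction separate from the network call makes it easy to map each UI control (voice character, stability, similarity) to one payload field and to check the payload without spending API quota.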

What's next for MySupportAgent

  • Activate a paid subscription for ElevenLabs API usage (the free plan is limited to local and personal development)

  • Introduce a Google Cloud Storage bucket to store the audio files for listening
