System Architecture: Agenticum G5 Genius — Gemini Live API, 52-node neural mesh, Cloud Run backend, React frontend on Firebase Hosting
Vertex AI Studio integrated — Gemini 2.5 Flash Native Audio for voice, Imagen 3 for visuals. Full multimodal AI stack on GCP.
Cloud Build history — 20+ automated deployments to Cloud Run europe-west1. CI/CD pipeline fully operational in production.
Vertex AI API — Imagen 3 image generation via PredictionService.GenerateContent, powering DA-03 Architect agent in real-time.
Cloud Run genius-backend — live request metrics, auto-scaling, min-instances=1 for zero cold-start. Production GCP deployment.
Inspiration
The modern marketing world drowns in data but starves for real-time intelligence. We asked ourselves: what if an AI agent could listen, think, and act — simultaneously — like a seasoned marketing strategist available 24/7?
That question sparked Agenticum G5 Genius: a voice-first, real-time AI live agent purpose-built for online marketing management, powered entirely by Google's Gemini 2.0 Flash Live API.
What it does
Agenticum G5 Genius is a real-time AI live agent that:
- Listens and responds via voice using Gemini 2.0 Flash Live (bidirectional audio streaming)
- Orchestrates 52 specialized AI nodes in a modular neural mesh architecture — each node handling a distinct marketing domain (SEO, content, ads, analytics, competitor research, and more)
- Generates autonomous marketing strategies on the fly, adapting to live user input
- Delivers actionable insights across keyword research, campaign optimization, pillar page generation, and audience targeting
- Integrates with Google Cloud services end-to-end: Vertex AI, Cloud Run, Firebase Hosting, Secret Manager
Users simply speak to the agent and, within seconds, receive intelligent, structured marketing guidance, as if talking to a senior strategist.
How we built it
Frontend: React + TypeScript, deployed on Firebase Hosting. The UI features a voice-first interface with real-time audio visualization, live transcript streaming, and a responsive chat panel.
Backend: Python + FastAPI on Google Cloud Run. We implemented a WebSocket server that maintains bidirectional audio streams with the Gemini 2.0 Flash Live API (gemini-2.0-flash-live-001) and handles PCM audio encoding and decoding in real time.
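To illustrate the PCM handling mentioned above, here is a minimal sketch using only Python's standard library. It assumes 16-bit signed little-endian mono PCM at 16 kHz and a 20 ms chunk size. The function names and constants are illustrative assumptions, not the project's actual code.

```python
import struct

# Assumed audio parameters (not confirmed by the project description).
SAMPLE_RATE_HZ = 16_000   # samples per second, mono
BYTES_PER_SAMPLE = 2      # 16-bit signed PCM
CHUNK_MS = 20             # hypothetical duration of one streamed chunk


def float_to_pcm16(samples):
    """Encode floats in [-1.0, 1.0] as little-endian 16-bit PCM bytes."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def pcm16_to_float(data):
    """Decode little-endian 16-bit PCM bytes back to floats in [-1.0, 1.0]."""
    count = len(data) // BYTES_PER_SAMPLE
    ints = struct.unpack(f"<{count}h", data[:count * BYTES_PER_SAMPLE])
    return [i / 32767 for i in ints]


def chunk_pcm(data, chunk_ms=CHUNK_MS, sample_rate=SAMPLE_RATE_HZ):
    """Split a PCM byte string into fixed-duration chunks for streaming."""
    chunk_bytes = sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE
    return [data[i:i + chunk_bytes] for i in range(0, len(data), chunk_bytes)]
```

Under these assumptions, one second of audio is 32,000 bytes and splits into 50 chunks of 640 bytes each. Smaller chunks lower buffering delay at the cost of more WebSocket messages.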
AI Architecture: A 52-node modular neural mesh where each node is a specialized Gemini agent. Nodes communicate through an orchestration layer that routes queries, aggregates results, and synthesizes coherent responses — all within the latency constraints of a live voice session.
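The route, aggregate, and synthesize flow described above could look roughly like the following sketch. It is an illustration under stated assumptions, not the project's orchestration layer: the node functions are stand-ins that return canned strings instead of calling Gemini, and the names `NODES`, `orchestrate`, `synthesize`, and the `budget_s` latency budget are all hypothetical.

```python
import asyncio


# Stand-in nodes: real nodes would call a model; these only simulate latency.
async def seo_node(query):
    await asyncio.sleep(0.01)
    return f"SEO ideas for: {query}"


async def ads_node(query):
    await asyncio.sleep(0.01)
    return f"Ad angles for: {query}"


async def slow_node(query):
    await asyncio.sleep(1.0)  # deliberately exceeds the latency budget
    return "too late"


NODES = {"seo": seo_node, "ads": ads_node, "slow": slow_node}


async def orchestrate(query, node_names, budget_s=0.2):
    """Fan a query out to the chosen nodes and keep only timely answers."""

    async def run(name):
        try:
            answer = await asyncio.wait_for(NODES[name](query), timeout=budget_s)
            return name, answer
        except asyncio.TimeoutError:
            return name, None  # drop nodes that miss the latency budget

    results = await asyncio.gather(*(run(name) for name in node_names))
    return {name: answer for name, answer in results if answer is not None}


def synthesize(results):
    """Merge node answers into one response, in a stable order."""
    return " | ".join(results[name] for name in sorted(results))
```

Running the nodes concurrently with a per-node timeout means one slow node cannot stall the spoken reply. The response is built from whichever nodes answered within the budget.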
Infrastructure:
- Google Cloud Run (auto-scaling backend)
- Firebase Hosting (global CDN frontend)
- Vertex AI (model access and management)
- Google Secret Manager (API key security)
- Cloud Build (CI/CD pipeline)
Challenges we ran into
- Real-time audio latency: Achieving sub-300ms round-trip latency with WebSocket audio streaming required careful PCM buffer management and stream multiplexing.
- 52-node orchestration: Coordinating dozens of specialized agents without creating bottlenecks or conflicting outputs demanded a robust priority-queue routing system.
- Voice + text synchronization: Keeping the transcript, audio playback, and visual indicators perfectly in sync across varying network conditions was a significant engineering challenge.
- Cold start optimization: Minimizing Cloud Run cold starts for a latency-sensitive voice application required keeping at least one instance warm (min-instances=1) alongside custom instance warm-up strategies.
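As an illustration of the priority-queue routing mentioned in the 52-node orchestration challenge, here is a minimal sketch built on Python's `heapq`. The `Task` and `PriorityRouter` names, the priority values, and the node labels are hypothetical; the project's actual routing system is not described in detail.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    priority: int                      # lower value = more urgent
    seq: int                           # tie-breaker: preserves submission order
    node: str = field(compare=False)
    payload: str = field(compare=False)


class PriorityRouter:
    """Hand out queued node tasks, most urgent first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, node, payload, priority):
        task = Task(priority, next(self._counter), node, payload)
        heapq.heappush(self._heap, task)

    def next_task(self):
        """Return the most urgent task, or None if the queue is empty."""
        return heapq.heappop(self._heap) if self._heap else None

    def __len__(self):
        return len(self._heap)


# Usage: latency-critical voice work jumps ahead of background analysis.
router = PriorityRouter()
router.submit("analytics", "weekly report", priority=5)
router.submit("voice-reply", "answer the user", priority=0)
router.submit("seo", "keyword scan", priority=5)
order = [router.next_task().node for _ in range(3)]
```

The sequence counter keeps tasks of equal priority in first-in, first-out order, so background nodes are not starved or reordered unpredictably.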
Accomplishments that we're proud of
- Built a fully functional voice-first AI marketing agent from scratch during the hackathon
- Achieved real-time bidirectional voice interaction with Gemini 2.0 Flash Live
- Successfully deployed a production-ready system on Google Cloud infrastructure
- Designed and implemented a 52-node neural mesh that scales gracefully
- Created an intuitive, polished UI that makes enterprise AI feel accessible
What we learned
- The Gemini 2.0 Flash Live API opens entirely new UX paradigms — voice is not just an interface, it's a superpower for AI agents
- Modular agent architectures are far more resilient and scalable than monolithic AI systems
- Google Cloud's managed services (Cloud Run + Firebase) enable a solo developer to ship production-grade infrastructure incredibly fast
- Real-time AI requires rethinking every layer of the stack — from network protocols to UI rendering
What's next for Agenticum G5 Genius
- Multimodal expansion: Adding screen sharing and image analysis so the agent can review landing pages, ad creatives, and dashboards in real time
- Autonomous campaign execution: Connecting to Google Ads and Meta APIs for direct campaign management via voice commands
- Enterprise SaaS launch: White-label version for marketing agencies under the Opus Magnum Media brand
- Expanded node mesh: Growing from 52 to 100+ specialized nodes covering the full digital marketing ecosystem
- Multi-language support: Extending voice interaction to German, Spanish, French, and Arabic markets
Built With
- audio
- cloud-build
- fastapi
- firebase-hosting
- gemini-2.0-flash-live-api
- google-cloud-run
- google-secret-manager
- pcm
- python
- react
- typescript
- vertex-ai
- websockets