Inspiration
We noticed that most AI assistants and chatbots are reactive: they wait for users to ask something. But busy people such as managers, founders, and students don’t need “another app to talk to.” They need something that anticipates their day, handles routine tasks, and proactively keeps them on track, like a real executive assistant.
We were inspired by the idea of Jarvis from Iron Man, not as a voice chatbot, but as a system that observes context, decides what matters, and takes action automatically. Echo AI is our step toward that reality.
What it does
Echo AI is a voice-first, proactive executive assistant built with the Google Agent Development Kit (ADK) and deployed on Google Cloud Run.
Echo can:
Listen to natural voice commands and understand intent
Schedule or reschedule meetings using Google Calendar
Summarize unread emails and draft replies
Fetch and summarize relevant daily news
Calculate commute time and tell you when to leave based on live traffic
Generate a Morning Briefing that automatically summarizes your schedule, travel time, priority emails, and news, without you having to ask.
The goal: You start your day already prepared.
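The "when to leave" step can be illustrated with a minimal sketch. It assumes the travel duration has already been obtained from a routing service with live traffic; the function names `leave_by` and `leave_message` and the 10-minute arrival buffer are illustrative assumptions, not Echo's actual code.

```python
from datetime import datetime, timedelta

# Hypothetical default: aim to arrive this long before the meeting starts.
DEFAULT_BUFFER = timedelta(minutes=10)


def leave_by(meeting_start: datetime, travel: timedelta,
             buffer: timedelta = DEFAULT_BUFFER) -> datetime:
    """Return the latest departure time that still arrives `buffer` early."""
    return meeting_start - travel - buffer


def leave_message(now: datetime, meeting_start: datetime, travel: timedelta,
                  buffer: timedelta = DEFAULT_BUFFER) -> str:
    """Turn the departure time into a short sentence the assistant could speak."""
    depart = leave_by(meeting_start, travel, buffer)
    minutes_left = int((depart - now).total_seconds() // 60)
    if minutes_left <= 0:
        return "Leave now to make your meeting."
    return f"Leave in {minutes_left} min (by {depart:%H:%M})."
```

For a 09:00 meeting with a 35-minute drive and the 10-minute buffer, `leave_by` returns 08:15; re-running the calculation as traffic changes would keep the advice current.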
How we built it
Google Cloud Run for deploying all backend services
Google ADK multi-agent architecture:
Calendar & Commute Agent → Google Calendar + Maps Routes API
News Agent → RSS + Gemini summarization via AI Studio
Email Agent → Gmail API
Orchestrator Agent → decides what information is relevant
Google Cloud Speech-to-Text for real-time transcription from the browser microphone
Google Cloud Text-to-Speech for natural spoken responses
WebSockets for low-latency audio streaming and conversational feedback
UI built in React + Vite with a clean, minimal “ripple” voice interface
Cloud Run Job + Cloud Scheduler to trigger the Morning Brief every day
The entire system runs serverless, scales automatically, and requires no manual infrastructure management.
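As a rough illustration of the orchestration idea (several agents each contribute a section, and the orchestrator keeps only what is relevant), here is a minimal plain-Python sketch. It deliberately does not use the ADK API; `Section`, `build_briefing`, and the priority field are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Section:
    title: str
    body: str
    priority: int  # lower numbers are read out first


def build_briefing(agents: Dict[str, Callable[[], Optional[Section]]]) -> str:
    """Run each agent, drop empty or failed results, and order by priority."""
    sections: List[Section] = []
    for name, run in agents.items():
        try:
            section = run()
        except Exception:
            # One failing agent (e.g. a news feed outage) should not
            # prevent the rest of the briefing from being delivered.
            continue
        if section is not None and section.body.strip():
            sections.append(section)
    sections.sort(key=lambda s: s.priority)
    return "\n\n".join(f"{s.title}: {s.body}" for s in sections)
```

In this sketch an agent signals "nothing relevant today" by returning `None`, so the spoken briefing only contains sections that have something to say.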
Challenges we ran into
Low-latency audio streaming was challenging: combining WebM/Opus browser encoding with server-side STT streaming required careful buffer handling.
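One common form of buffer handling is re-chunking: browser audio arrives over the WebSocket in messages of varying size, while the server forwards evenly sized chunks to the recognizer. The sketch below shows that idea only. `ChunkBuffer` and the 4096-byte default are assumptions for illustration, and it treats the audio as an opaque byte stream (it does not parse the WebM container), which assumes the downstream recognizer accepts the stream in arbitrary byte chunks.

```python
from typing import List


class ChunkBuffer:
    """Accumulate variable-size audio messages and emit fixed-size chunks."""

    def __init__(self, chunk_size: int = 4096) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self._chunk_size = chunk_size
        self._pending = bytearray()

    def feed(self, data: bytes) -> List[bytes]:
        """Add incoming bytes and return every full chunk now available."""
        self._pending.extend(data)
        chunks: List[bytes] = []
        while len(self._pending) >= self._chunk_size:
            chunks.append(bytes(self._pending[:self._chunk_size]))
            del self._pending[:self._chunk_size]
        return chunks

    def flush(self) -> bytes:
        """Return the leftover bytes (possibly empty) when the stream ends."""
        rest = bytes(self._pending)
        self._pending.clear()
        return rest
```

Smaller chunks lower latency at the cost of more requests; the right size would need to be measured against the actual recognizer.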
Ensuring proactive behavior without being interruptive meant designing the right triggers and thresholds for when the agent should speak.
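A trigger policy of this kind can be sketched as a small decision function. Everything here is a hypothetical illustration: the urgency score in the range 0 to 1, the two thresholds, and the 5-minute cooldown are assumed values, not the ones Echo uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerPolicy:
    min_urgency: float = 0.6   # threshold when the user is idle
    busy_urgency: float = 0.9  # stricter threshold during meetings or focus time
    cooldown_s: float = 300.0  # minimum gap between unprompted messages


def should_speak(urgency: float, user_busy: bool,
                 seconds_since_last_prompt: float,
                 policy: TriggerPolicy = TriggerPolicy()) -> bool:
    """Decide whether the agent may interrupt the user right now."""
    in_cooldown = seconds_since_last_prompt < policy.cooldown_s
    # During the cooldown only near-critical items may break through.
    if in_cooldown and urgency < policy.busy_urgency:
        return False
    threshold = policy.busy_urgency if user_busy else policy.min_urgency
    return urgency >= threshold
```

With these assumed defaults, a moderately urgent item (0.7) is spoken to an idle user after the cooldown, but held back if the user is busy or was prompted a minute ago.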
We also had to refine our summarization prompts so that news briefings remained factual, concise, and contextual instead of generic.
Accomplishments that we're proud of
We successfully built a system that feels agentic: Echo does things on its own rather than waiting for input.
The Morning Briefing feature turned out to be both powerful and surprisingly natural to use.
We deployed a multi-agent architecture entirely on Cloud Run, with smooth communication between agents and real-time voice interaction.
What we learned
Building an agent that acts intelligently is less about model complexity and more about context modeling and trigger design.
Real-time voice applications depend heavily on streaming architecture, not just LLM quality.
Proactivity is a UX problem first: the agent must help without being intrusive.
Google Cloud Run + ADK make multi-agent orchestration surprisingly clean compared to traditional server stacks.
What's next for Echo AI
Long-term personal memory (preferences, habits, communication style, recurring patterns)
Meeting intelligence (auto notes + action item extraction)
Deep workspace integrations: Slack, Notion, Jira, Teams
Adaptive voice tone that changes with the time of day, such as a calm voice in the morning, a focused tone during work sessions, and an end-of-day wrap-up in the evening
Mobile-first app with a continuous, lightweight listening mode
Built With
- javascript
- python
- react
- tts
- websocket