Design Architecture
Multi Agent Architecture

Real-Time Emergency Detection & Insight Generation

Inspiration

Healthcare emergencies demand immediate action — every second counts when a patient's vitals indicate a life-threatening condition. Traditional monitoring systems often rely on manual observation or periodic checks, creating dangerous gaps in detection and response time. I envisioned a solution where AI could continuously monitor vital signs in real-time, instantly detect anomalies, alert medical staff through their existing communication channels, and provide comprehensive insights to support rapid decision-making.

The inspiration came from recognizing that this challenge extends far beyond healthcare. Manufacturing floors need real-time equipment failure detection. Smart cities require instant infrastructure monitoring. Supply chains demand immediate risk alerts. The common thread? The need for intelligent systems that can detect, notify, and enable action — all in real-time. This project was born from the vision of creating a universal architecture for emergency detection that could save lives, prevent disasters, and enable proactive responses across any domain.

What it does

Real-Time Emergency Detection and Insight Generation is an intelligent AI-powered system that continuously monitors live data streams from IoT devices, detects potential emergencies in real-time, immediately notifies users through Slack, and provides an interactive AI Assistant for deep insights and automated reporting.

Key Capabilities:

🔹 Real-Time Risk Detection

Continuously ingests streaming data from IoT sensors via Pub/Sub
Classifies incoming vitals/metrics against predefined risk thresholds
Detects anomalies and potential emergency conditions within seconds of data arrival
Triggers immediate Slack notifications when risks are identified

🔹 Intelligent Multi-Agent AI Assistant

Powered by Google's Agent Development Kit (ADK) with 8 specialized agents:
- Root Agent: Orchestrates user interactions and conversation flow
- Main Agent: Coordinates multi-agent workflows and aggregates results
- Retrieval Agent: Performs hybrid SQL + embedding-based searches
- Analysis Agent: Computes statistical metrics (mean, min, max, std deviation)
- Effects Analysis Agent: Matches vitals to clinical conditions and potential effects
- Report Agent: Generates comprehensive PDF reports with visualizations
- Uploader Agent: Manages cloud storage and generates accessible URLs
- Memory Manager: Maintains contextual state for multi-turn conversations

🔹 Smart Query Processing

Natural language understanding for queries like "Show me the latest readings for patient P100"
Rule-based classification for common patterns (optimized for speed)
LLM-powered fallback for complex or ambiguous queries
Contextual memory enables follow-up queries without repeating patient IDs

🔹 Comprehensive Clinical Insights

Effects analysis automatically detects conditions like:
- Bradycardia (slow heart rate)
- Tachycardia (rapid heart rate)
- Hypertension (high blood pressure)
- Hypotension (low blood pressure)
- Hypoxemia (low oxygen levels)
Provides potential effects and recommended clinical actions
Statistical analysis of vital trends over time

🔹 Professional Reporting

Auto-generated 3-page PDF reports with:
- Patient vital signs summary
- Time-series visualizations (heart rate, blood pressure, oxygen levels)
- Detected conditions and potential effects
- Statistical analysis and trend insights
Downloadable reports for medical records and team collaboration

🔹 Custom Web Interface

Modern, responsive City Hospital-branded UI
Real-time chat interface with the AI Assistant
Quick action buttons for common queries
Color-coded vital signs display
Clinical condition cards with severity indicators
One-click PDF report downloads

Use Case: ER Patient Vital Monitoring

In the demonstration scenario:

IoT sensors stream patient vitals (heart rate, blood pressure, oxygen levels) to Pub/Sub
Cloud Run subscriber receives data, detects a patient with critically low heart rate (45 bpm)
Slack notification immediately alerts the medical team: "⚠️ Risk detected for Patient P100"
Doctor opens the AI Assistant web interface
Queries: "Show me the latest readings for patient P100"
AI responds with vitals summary + effects analysis (Bradycardia detected)
Doctor requests: "Generate a comprehensive PDF report"
System creates professional report with graphs, uploads to GCS, provides download link
Doctor reviews report and discusses treatment plan with chief physician

Adaptability Across Domains:

This architecture is domain-agnostic and can be applied to:

Manufacturing: Equipment temperature/vibration monitoring → Failure prediction
Smart Cities: Traffic flow/air quality sensors → Congestion/pollution alerts
Supply Chain: Shipment tracking/cold chain monitoring → Delay/spoilage detection
Energy: Grid monitoring/renewable energy systems → Outage/efficiency alerts
Agriculture: Soil moisture/crop health sensors → Irrigation/pest alerts

How we built it

Architecture Overview

The solution leverages a multi-layered Google Cloud architecture with AI-powered intelligence:

Layer 1: Data Ingestion & Streaming

Pub/Sub Topic: Receives continuous IoT sensor data streams
Message Publishing: Sensors publish vital signs (heart rate, BP, oxygen) as JSON payloads
Subscriber Pattern: Cloud Run service subscribes to topic for real-time processing
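
As a rough illustration of the subscriber pattern, the sketch below decodes one Pub/Sub delivery into a vitals record. It assumes push delivery (the write-up does not say whether push or pull is used), where Pub/Sub POSTs a JSON envelope whose message.data field is base64-encoded. The function names and the payload fields are assumptions made for the example.

```python
import base64
import json


def decode_push_envelope(envelope: dict) -> dict:
    """Extract the JSON vitals payload from a Pub/Sub push envelope."""
    message = envelope.get("message")
    if not message or "data" not in message:
        raise ValueError("not a Pub/Sub push envelope")
    # Pub/Sub base64-encodes the published bytes in the "data" field.
    raw = base64.b64decode(message["data"])
    return json.loads(raw.decode("utf-8"))


def make_envelope(payload: dict) -> dict:
    """Build an envelope shaped like a push delivery (for local testing only)."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"message": {"data": data, "messageId": "1"}, "subscription": "demo"}


if __name__ == "__main__":
    sample = {"patient_id": "P100", "heart_rate": 45, "oxygen_level": 96}
    print(decode_push_envelope(make_envelope(sample)))
```

Keeping the decoding step separate from the risk check makes it easy to replay recorded payloads locally without a live topic.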

Layer 2: Real-Time Risk Detection & Notification

Cloud Run Subscriber Service:
- Receives Pub/Sub messages
- Classifies vitals against risk thresholds using predefined rules
- Detects emergency conditions (e.g., heart rate < 50 bpm or > 120 bpm)
- Triggers immediate Slack webhook notifications
- Runs containerized Python service with high availability
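
A minimal sketch of the threshold check, using the heart-rate limits quoted above (below 50 bpm or above 120 bpm). The function name, the payload field names, and the exact alert wording are assumptions, and other vitals would need their own limits.

```python
from typing import Optional

# Alert limits taken from the example above; real limits are a clinical decision.
HEART_RATE_LOW = 50
HEART_RATE_HIGH = 120


def detect_risk(vitals: dict) -> Optional[str]:
    """Return an alert message if the reading breaches a limit, else None."""
    heart_rate = vitals.get("heart_rate")
    if heart_rate is None:
        return None  # nothing to classify
    if heart_rate < HEART_RATE_LOW or heart_rate > HEART_RATE_HIGH:
        return (
            f"⚠️ Risk detected for Patient {vitals['patient_id']}: "
            f"heart rate {heart_rate} bpm"
        )
    return None


if __name__ == "__main__":
    print(detect_risk({"patient_id": "P100", "heart_rate": 45}))
```

The comparisons are strict, so a reading of exactly 50 or 120 bpm does not raise an alert in this sketch.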

Layer 3: Data Persistence & Enrichment

Google Cloud Storage: Stores raw message payloads as JSON files
Cloud Scheduler: Triggers high-frequency Cloud Run jobs for near real-time processing
Embedding Generation:
- Uses Vertex AI text-embedding-004 model
- Converts vital records to vector embeddings
- Enables semantic similarity search
BigQuery Ingestion:
- Dataset: vitals_dataset
- Table: patient_vitals with schema:
  - patient_id, timestamp, heart_rate, bp_systolic, bp_diastolic, oxygen_level, embedding (ARRAY)
- Supports both SQL queries and vector cosine similarity search
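
To make the vector side concrete, here is a small pure-Python sketch of cosine similarity ranking over stored embeddings. In the project this comparison runs inside BigQuery; the in-memory rows, the tiny two-dimensional vectors, and the helper names below are assumptions for illustration only.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_similar(query: Sequence[float], rows: List[dict], k: int = 3) -> List[dict]:
    """Return the k rows whose 'embedding' is closest to the query vector."""
    scored = [(cosine_similarity(query, row["embedding"]), row) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored[:k]]


if __name__ == "__main__":
    rows = [
        {"patient_id": "P100", "embedding": [1.0, 0.0]},
        {"patient_id": "P200", "embedding": [0.0, 1.0]},
    ]
    print(top_k_similar([0.9, 0.1], rows, k=1))
```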

Layer 4: AI Assistant (Multi-Agent System)

Built with Google ADK (Agent Development Kit) using Gemini 2.0 Flash:

8 Specialized Agents:

Root Agent (ERMonitoringAgent)
- Entry point for user interactions
- Handles ADK protocol communication
- Calls adk_tool_handle_query() to invoke MainAgent
Main Agent (Orchestrator)
- Coordinates all specialized agents
- Manages query routing and response aggregation
- Implements conditional workflows (e.g., PDF generation only when requested)
Retrieval Agent
- Query Classification (Performance-optimized):
  - Rule-based: Detects keywords like "latest", "recent", "show" → SQL mode
  - Embedding-based: Detects "similar", "trends", "patterns" → Vector search
  - LLM fallback: Uses Gemini 2.0 Flash for ambiguous queries
- SQL Generation: Converts natural language → BigQuery SQL using LLM
- Execution: Runs SQL queries or cosine similarity searches
- Summarization: LLM-powered natural language summaries
Analysis Agent
- Computes statistical metrics: mean, min, max, standard deviation
- Provides row counts and data quality insights
Effects Analysis Agent
- Queries effects_master_table with condition ranges:
  - Bradycardia: heart_rate < 60 bpm
  - Tachycardia: heart_rate > 100 bpm
  - Hypertension: bp_systolic > 140 or bp_diastolic > 90
  - Hypotension: bp_systolic < 90 or bp_diastolic < 60
  - Hypoxemia: oxygen_level < 90%
- Matches patient vitals to conditions
- Returns detected conditions, potential effects, and vitals analyzed
Report Agent
- Generates 3-page PDF reports using ReportLab
- Creates time-series visualizations with Matplotlib:
  - Heart Rate over time
  - Blood Pressure trends (systolic/diastolic)
  - Oxygen Level variations
- Formats clinical data with headers, tables, and color-coded sections
Uploader Agent
- Uploads PDFs to GCS bucket: erpatientvitals
- Dual URL strategy:
  - Local: Signed URLs (7-day expiration) using service account key
  - Cloud Run: Public URLs (bucket has allUsers objectViewer role)
- Implements graceful fallback for signed URL failures
Memory Manager
- Maintains last 5 queries in contextual memory
- Tracks patient IDs for multi-turn conversations
- Injects context: "latest readings" + previous patient ID → "latest readings for P100"
- Enables natural follow-up queries
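
The Memory Manager behaviour described above can be sketched roughly as follows. This is not the project's code: the class layout, the patient ID pattern (a "P" followed by digits, as in P100), and the way context is appended are all assumptions.

```python
import re
from collections import deque
from typing import Optional

# Assumed ID format: the letter P followed by digits, e.g. P100.
PATIENT_ID_PATTERN = re.compile(r"\bP\d+\b", re.IGNORECASE)


class MemoryManager:
    """Keeps the last few queries and the most recently mentioned patient ID."""

    def __init__(self, max_queries: int = 5) -> None:
        self.recent_queries = deque(maxlen=max_queries)  # oldest entries drop off
        self.last_patient_id: Optional[str] = None

    def resolve(self, query: str) -> str:
        """Return the query with the remembered patient ID injected if needed."""
        match = PATIENT_ID_PATTERN.search(query)
        if match:
            self.last_patient_id = match.group(0).upper()
            resolved = query
        elif self.last_patient_id:
            resolved = f"{query} for {self.last_patient_id}"
        else:
            resolved = query
        self.recent_queries.append(resolved)
        return resolved


if __name__ == "__main__":
    memory = MemoryManager()
    memory.resolve("Show me the latest readings for patient P100")
    print(memory.resolve("latest readings"))
```

A bounded deque gives the "last 5 queries" window without any manual trimming.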

Layer 5: Web Interface

FastAPI Application (fastapi_app.py):
- Endpoints:
  - GET / → Serves index.html
  - POST /api/query → Processes user queries via MainAgent
  - GET /health → Health check for Cloud Run
  - GET /api/welcome → Initial greeting
- CORS enabled for browser requests
- Static file serving for CSS/JS assets
Custom Frontend:
- index.html: Responsive chat interface with City Hospital branding
- style.css: Medical-themed design with animated heartbeat logo
- script.js: API calls, message rendering, quick action buttons

Layer 6: Deployment

Docker Containerization:
- Dockerfile: Multi-stage build for FastAPI + ADK dependencies
- Uvicorn ASGI server on port 8080 (Cloud Run) / 8000 (local)
- Health checks via /health endpoint
- .dockerignore: Optimizes build context
Cloud Run Deployment:
- cloudbuild.yaml: Automated CI/CD pipeline
- Build context: ERAgent/ directory
- Environment variables: BQ_DATASET, BQ_TABLE, PROJECT_ID, GCS_BUCKET
- Auto-scaling based on request volume
- Supports both ADK and FastAPI interfaces

Technology Stack

Component	Technology
AI Framework	Google ADK (Agent Development Kit)
LLM	Gemini 2.0 Flash (multimodal)
Embeddings	Vertex AI text-embedding-004
Database	BigQuery (structured + vector data)
Streaming	Pub/Sub (IoT data ingestion)
Storage	Google Cloud Storage (PDFs, raw data)
Compute	Cloud Run (serverless containers)
Scheduling	Cloud Scheduler (near real-time jobs)
Notifications	Slack Webhooks
Web Framework	FastAPI (Python async framework)
Server	Uvicorn (ASGI server)
PDF Generation	ReportLab
Visualization	Matplotlib
Frontend	HTML5, CSS3, Vanilla JavaScript
Deployment	Docker, Cloud Build
Authentication	Service Account (IAM)

Performance Optimizations

Rule-Based Query Classification: Saves 1-2 seconds per query (30-40% faster)
Parallel Agent Execution: Analysis + Effects Analysis run concurrently
Conditional PDF Generation: Only when explicitly requested
GCS Public Bucket: Eliminates signed URL overhead on Cloud Run
Cloud Run Auto-Scaling: Handles variable load automatically

Challenges we ran into

Building a production-ready real-time emergency detection system required overcoming several technical hurdles:

🔧 Query Performance: Initial implementation took 3.5-6.5 seconds per query due to LLM classification overhead. Optimized with rule-based keyword matching first, achieving 30-40% performance improvement (2-4.5 seconds).
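
A hedged sketch of the rules-first idea: keyword sets decide the common cases and the LLM is only called when the rules are silent or disagree. The keyword lists come from the Retrieval Agent description; the tie-breaking rule and the llm_classify callable are assumptions standing in for the real Gemini call.

```python
import re
from typing import Callable

SQL_KEYWORDS = frozenset({"latest", "recent", "show"})
VECTOR_KEYWORDS = frozenset({"similar", "trends", "patterns"})


def classify_query(query: str, llm_classify: Callable[[str], str]) -> str:
    """Return 'sql' or 'vector', calling the LLM only when rules cannot decide."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    wants_sql = bool(words & SQL_KEYWORDS)
    wants_vector = bool(words & VECTOR_KEYWORDS)
    if wants_sql and not wants_vector:
        return "sql"
    if wants_vector and not wants_sql:
        return "vector"
    # Neither or both rule sets matched: defer to the slower LLM classifier.
    return llm_classify(query)


if __name__ == "__main__":
    # Stand-in for the LLM so the example runs offline.
    print(classify_query("Show me the latest readings for patient P100", lambda q: "sql"))
```

Because the fast path never touches the LLM, the saving applies to every query that matches exactly one keyword set.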

🔧 Multi-Agent Orchestration: Managing data flow between 8 specialized agents with dependencies and conditional workflows required careful sequencing: Memory → Retrieval → Parallel (Analysis + Effects) → Conditional (Report + Upload).
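
The sequencing can be pictured with a small asyncio sketch. Every coroutine here is a stand-in for the corresponding agent (the real ones call BigQuery, Gemini, ReportLab, and GCS), and the trigger words for report generation are assumptions.

```python
import asyncio
import statistics


async def resolve_context(query: str) -> str:
    return query  # stand-in for the Memory Manager


async def retrieve(query: str) -> list:
    # Stand-in for the Retrieval Agent: fixed heart-rate rows.
    return [{"heart_rate": 45}, {"heart_rate": 52}, {"heart_rate": 61}]


async def analyze(rows: list) -> dict:
    values = [row["heart_rate"] for row in rows]
    return {"mean": statistics.mean(values), "min": min(values), "max": max(values)}


async def analyze_effects(rows: list) -> list:
    return ["Bradycardia"] if any(row["heart_rate"] < 60 for row in rows) else []


async def build_and_upload_report(result: dict) -> str:
    return "report.pdf"  # stand-in for Report Agent + Uploader Agent


async def handle_query(query: str) -> dict:
    resolved = await resolve_context(query)           # 1. Memory
    rows = await retrieve(resolved)                   # 2. Retrieval
    stats, effects = await asyncio.gather(            # 3. Parallel step
        analyze(rows), analyze_effects(rows)
    )
    result = {"rows": rows, "stats": stats, "effects": effects}
    if "pdf" in resolved.lower() or "report" in resolved.lower():
        result["report"] = await build_and_upload_report(result)  # 4. Conditional
    return result


if __name__ == "__main__":
    print(asyncio.run(handle_query("Generate a comprehensive PDF report")))
```

asyncio.gather only helps when the two agents spend their time waiting on I/O, which is the case when both query BigQuery.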

🔧 Clinical Accuracy: Effects analysis needed precise medical thresholds. Created comprehensive effects_master_table in BigQuery with exact clinical ranges for conditions like Bradycardia (HR < 60), Hypertension (BP > 140/90), and Hypoxemia (SpO2 < 90%).
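
As an illustration of how such ranges turn into detected conditions, the sketch below encodes the five thresholds listed for the Effects Analysis Agent. In the project the ranges live in effects_master_table in BigQuery; holding them in a Python dict, and the function name, are assumptions for the example.

```python
from typing import Callable, Dict, List

# Thresholds as listed for the Effects Analysis Agent.
CONDITION_RULES: Dict[str, Callable[[dict], bool]] = {
    "Bradycardia": lambda v: v["heart_rate"] < 60,
    "Tachycardia": lambda v: v["heart_rate"] > 100,
    "Hypertension": lambda v: v["bp_systolic"] > 140 or v["bp_diastolic"] > 90,
    "Hypotension": lambda v: v["bp_systolic"] < 90 or v["bp_diastolic"] < 60,
    "Hypoxemia": lambda v: v["oxygen_level"] < 90,
}


def detect_conditions(vitals: dict) -> List[str]:
    """Return the names of all conditions whose range matches the reading."""
    return [name for name, rule in CONDITION_RULES.items() if rule(vitals)]


if __name__ == "__main__":
    reading = {
        "heart_rate": 45,
        "bp_systolic": 120,
        "bp_diastolic": 80,
        "oxygen_level": 96,
    }
    print(detect_conditions(reading))
```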

Each challenge strengthened the system's reliability, performance, and clinical accuracy — essential for life-critical applications.

Accomplishments that we're proud of

🏆 Built a Production-Ready Multi-Agent AI System

Successfully architected and deployed an 8-agent system using Google ADK that demonstrates real-world applicability. Each agent has a clear single responsibility, enabling modularity, testability, and independent scaling.

⚡ Achieved 30-40% Query Performance Improvement

Optimized query classification with rule-based preprocessing before LLM fallback, reducing average response time from 3.5-6.5 seconds to 2-4.5 seconds — critical for emergency response scenarios.

🔄 Seamless Integration of Google Cloud Services

Created a cohesive architecture combining Pub/Sub, Cloud Run, BigQuery, GCS, Vertex AI, and Cloud Scheduler — all working in harmony for real-time emergency detection and insight generation.

🎨 Custom-Branded Professional Web Interface

Replaced the generic ADK web UI with a fully custom FastAPI application featuring City Hospital branding, responsive design, and an intuitive chat interface with color-coded vitals and clinical condition displays.

🤖 Hybrid AI Approach: Rule-Based + LLM

Balanced speed and intelligence by combining rule-based classification for common patterns with LLM-powered understanding for complex queries — achieving both performance and accuracy.

📊 Automated Clinical Reporting with Visualizations

Built end-to-end PDF report generation with professional formatting, time-series graphs (Matplotlib), clinical insights, and one-click downloads — ready for medical records.

🧠 Contextual Memory for Natural Conversations

Implemented conversation state management that remembers patient IDs across queries, enabling natural follow-ups like "Show me their oxygen levels" without repeating context.

🔐 Dual-Mode Cloud Storage Strategy

Solved the signed URL vs. public URL challenge with graceful fallback: service account keys for local development security, public bucket for Cloud Run simplicity.

🚀 Fully Containerized & Cloud-Native Deployment

Dockerized the entire application with optimized build contexts, health checks, and Cloud Build CI/CD pipeline — deployable to Cloud Run with a single command.

🌐 Domain-Agnostic Architecture

Designed a reusable pattern that extends beyond healthcare to manufacturing, smart cities, supply chains, energy grids, and agriculture — proving the versatility of the solution.

📈 Vector Search + SQL Dual Retrieval

Implemented hybrid search capabilities using BigQuery's vector cosine similarity for semantic queries ("find similar patients") and SQL for structured queries ("latest readings") — best of both worlds.

💬 Real-Time Slack Notifications

Integrated immediate alerting via Slack webhooks, ensuring medical teams receive critical notifications within seconds of risk detection.

🎯 Effects Analysis Automation

Created intelligent condition detection that automatically identifies Bradycardia, Tachycardia, Hypertension, Hypotension, and Hypoxemia — providing actionable clinical insights without manual analysis.

📝 Comprehensive Documentation

Created detailed guides for local testing, Docker deployment, FastAPI implementation, effects analysis, and architecture — ensuring maintainability and knowledge transfer.

🛠️ Problem-Solving Resilience

Systematically debugged and resolved 10+ critical issues (environment variables, GCS uploads, Docker builds, performance bottlenecks) — demonstrating persistence and technical depth.

What we learned

1. Multi-Agent Architecture is the Future of AI Applications

Breaking complex tasks into specialized agents (Retrieval, Analysis, Effects, Reporting) creates maintainable, scalable systems. Each agent can be tested, optimized, and replaced independently — far superior to monolithic approaches.

2. Environment Configuration is Critical

A module that reads environment variables at import time captures whatever value exists at that moment, so the variables must be set before the import, not after. Getting this order right would have saved hours of debugging. The same applies to any module with initialization-time dependencies.
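
The effect is easy to reproduce. The sketch below writes a throwaway module that reads BQ_DATASET at import time, then shows that changing the variable afterwards has no effect on the value already captured. The module name demo_settings and the helper function are hypothetical.

```python
import importlib
import os
import sys
import tempfile
import textwrap

MODULE_SOURCE = textwrap.dedent(
    """
    import os
    # Read once, when the module is first imported.
    BQ_DATASET = os.environ.get("BQ_DATASET", "unset")
    """
)


def import_demo_settings(module_name: str = "demo_settings"):
    """Write a throwaway module to a temp directory and import it."""
    module_dir = tempfile.mkdtemp()
    with open(os.path.join(module_dir, f"{module_name}.py"), "w") as handle:
        handle.write(MODULE_SOURCE)
    sys.path.insert(0, module_dir)
    importlib.invalidate_caches()
    try:
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(module_dir)


if __name__ == "__main__":
    os.environ["BQ_DATASET"] = "vitals_dataset"  # set BEFORE the import
    settings = import_demo_settings()
    os.environ["BQ_DATASET"] = "changed_later"   # too late, value already captured
    print(settings.BQ_DATASET)
```

Reading the variable inside a function instead of at module level avoids the ordering problem altogether.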

3. Performance Optimization Requires Measurement

Rule-based preprocessing for common patterns (30-40% speedup) proved that not every query needs an LLM. The right balance between rules and AI maximizes both speed and intelligence.

4. Cloud-Native Design Requires Fallback Strategies

Signed URLs work locally, where a service account key file is available for signing, but fail on Cloud Run when no private key is present. Public bucket access works on Cloud Run but gives up access control. Always implement graceful fallbacks.
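
One way to express the fallback, sketched under assumptions: sign is a hypothetical callable that wraps whatever produces the signed URL, and the public URL form only works when the object is publicly readable.

```python
from typing import Callable
from urllib.parse import quote


def public_url(bucket: str, object_name: str) -> str:
    """Public HTTPS URL for an object in a publicly readable bucket."""
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"


def resolve_download_url(
    bucket: str, object_name: str, sign: Callable[[str, str], str]
) -> str:
    """Prefer a signed URL; fall back to the public URL if signing fails."""
    try:
        return sign(bucket, object_name)
    except Exception:
        # Broad on purpose: any signing failure (missing key, permissions)
        # should degrade to the public link rather than break the response.
        return public_url(bucket, object_name)


if __name__ == "__main__":
    def failing_signer(bucket: str, object_name: str) -> str:
        raise RuntimeError("no private key available")

    print(resolve_download_url("erpatientvitals", "reports/P100.pdf", failing_signer))
```

Passing the signer in as a parameter keeps the fallback logic testable without cloud credentials.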

5. Google ADK Simplifies Agent Development

ADK's base agent pattern, tool integration, and state management eliminated boilerplate code. However, understanding its lifecycle (initialization, tool execution, response handling) requires experimentation.

6. BigQuery is More Than a Data Warehouse

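Vector embeddings in ARRAY<FLOAT64> columns, compared with a cosine measure such as BigQuery's built-in COSINE_DISTANCE() function (cosine similarity is 1 minus that distance), enable semantic search without external vector databases. BigQuery unifies structured SQL and unstructured vector search.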

7. Contextual Memory Transforms User Experience

Remembering patient IDs across turns (Memory Manager) enables natural conversations: "Show latest readings" → "Generate PDF" → "What about oxygen levels?" Users don't repeat context.

8. FastAPI is Ideal for AI Microservices

Async support, automatic OpenAPI docs, type hints with Pydantic models, and CORS handling made building the REST API effortless. Uvicorn provides production-ready ASGI serving.

9. Docker Build Context Matters

.dockerignore location (root vs. ERAgent/) determines what files are visible. Build context directory (dir: 'ERAgent') in cloudbuild.yaml changes relative paths. Always verify file visibility during builds.

10. Real-Time Systems Need Near Real-Time, Not Instant

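Cloud Scheduler triggering Cloud Run every 1-5 minutes provides "near real-time" processing for the persistence and enrichment path (embeddings and BigQuery ingestion). The alert path is separate: the Pub/Sub subscriber notifies Slack within seconds of a risky reading. True real-time (milliseconds) requires streaming pipelines. For emergency detection, sub-minute alert latency is acceptable.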

11. LLM Integration Has Three Key Patterns

Classification: Determine query type (SQL vs. embedding)
Generation: Create SQL from natural language
Summarization: Convert tabular data to human-readable text

All three patterns use the same LLM (Gemini 2.0 Flash) with different prompts.

12. Error Handling Must Be User-Friendly

Instead of exposing stack traces, catch exceptions and return helpful messages: "BigQuery query failed. Please check your dataset configuration." Users need actionable guidance, not technical details.
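
A small sketch of that pattern: the traceback goes to the logs while the caller receives only the friendly message. The wrapper name, the logger name, and the response shape are assumptions.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("er_assistant")  # hypothetical logger name


def run_safely(action: Callable[[], Any], friendly_message: str) -> dict:
    """Run an action and convert any failure into a user-facing message."""
    try:
        return {"ok": True, "data": action()}
    except Exception:
        # The full traceback is logged for developers ...
        logger.exception("query failed")
        # ... while the user only sees actionable guidance.
        return {"ok": False, "message": friendly_message}


if __name__ == "__main__":
    def broken_query() -> None:
        raise RuntimeError("404 Not found: Dataset my-project:vitals_dataset")

    print(
        run_safely(
            broken_query,
            "BigQuery query failed. Please check your dataset configuration.",
        )
    )
```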

13. Hybrid Retrieval Beats Pure Vector or Pure SQL

Vector search finds semantically similar records ("patients with similar conditions"). SQL retrieves precise matches ("patient P100's latest reading"). Supporting both is essential.

14. Clinical Domain Knowledge is Non-Negotiable

Accurate effects analysis required understanding medical thresholds: Bradycardia < 60 bpm, Hypertension > 140/90 mmHg, Hypoxemia < 90% SpO2. Generic AI can't replace domain expertise.

15. Cloud Run Auto-Scaling is Powerful but Requires Tuning

Default settings may cause cold starts. Minimum instances (> 0) eliminate cold starts but increase costs. Health checks prevent premature termination. Balance cost vs. latency based on usage patterns.

16. Slack Webhooks Are Simplest for Notifications

No OAuth, no complex APIs — just HTTP POST to webhook URL. Perfect for critical alerts. For richer interactions (buttons, modals), use Slack SDK with bot tokens.
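
For reference, posting to an incoming webhook needs nothing beyond the standard library. The sketch below builds the request separately from sending it so the payload can be inspected offline; the function names are assumptions and the URL in the usage line is a placeholder, not a real webhook.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Prepare the JSON POST that an incoming webhook expects."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_alert(webhook_url: str, text: str, timeout: float = 5.0) -> int:
    """Send the alert and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder URL: only the request is built here, nothing is sent.
    request = build_slack_request(
        "https://example.invalid/webhook", "Risk detected for Patient P100"
    )
    print(request.get_method(), request.data.decode("utf-8"))
```

A short timeout matters in the subscriber, so a slow Slack response cannot hold up processing of the next message.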

17. PDF Generation Requires Layout Planning

ReportLab needs precise coordinate calculations. Matplotlib graphs must be saved as images before embedding. Three-page structure (summary → graphs → insights) provides comprehensive reports.

18. Public Cloud Storage Has Security Trade-offs

allUsers objectViewer role enables URL access without authentication — convenient but exposes data publicly. For production, use signed URLs with short expiration or Cloud CDN with Cloud Armor.

19. Multi-Turn Conversations Need State Management

A stateless HTTP API, such as this FastAPI service, keeps no memory between requests. ADK's built-in state management handles this, but custom solutions need databases (Firestore, Redis) or session storage.

20. Testing is 50% of Development Time

Testing agent interactions, BigQuery queries, GCS uploads, Slack notifications, PDF generation, and deployment pipelines required systematic validation. Automated tests would save significant time.

What's next for Real-Time Emergency Detection & Insight Generation

🔮 Short-Term Enhancements (Next 3 Months)

1. Advanced Analytics & Predictive Insights

Trend Analysis Agent: Predict patient deterioration 30-60 minutes before critical events using time-series forecasting (Prophet, LSTM)
Anomaly Detection: Use Vertex AI AutoML to detect unusual patterns that don't match predefined thresholds
Risk Scoring: Implement MEWS (Modified Early Warning Score) calculation for holistic patient risk assessment

2. Enhanced Notification System

Multi-Channel Alerts: Expand beyond Slack to SMS (Twilio), email, and mobile push notifications
Escalation Policies: Auto-escalate critical alerts if not acknowledged within 5 minutes
Smart Routing: Send alerts to on-duty staff based on shift schedules and specialization

3. Voice Interface Integration

Voice Queries: "Alexa, what are Patient P100's latest vitals?" using Vertex AI Speech-to-Text + Text-to-Speech
Hands-Free Reporting: Generate reports via voice commands in sterile environments (ORs, ICUs)

4. Real-Time Dashboard

Live Vitals Display: WebSocket-based streaming dashboard showing all patients' current status
Color-Coded Risk Levels: Green (stable), Yellow (warning), Red (critical)
Interactive Charts: Drill-down into individual patient trends with Chart.js/D3.js

5. Mobile App Development

Native iOS/Android Apps: Flutter-based mobile interface for on-the-go access
Offline Mode: Cache recent patient data for review in low-connectivity areas
Biometric Authentication: Fingerprint/Face ID for secure access

🚀 Medium-Term Goals (6-12 Months)

6. Federated Multi-Hospital Deployment

Centralized Monitoring: Aggregate data from multiple hospitals/clinics into regional dashboards
Privacy-Preserving Analytics: Federated learning to train models on distributed data without centralization
Inter-Hospital Transfer Insights: Recommend optimal transfer destinations based on capacity and specialization

7. Advanced AI Agents

Treatment Recommendation Agent: Suggest evidence-based interventions using medical knowledge graphs (PubMed, clinical guidelines)
Drug Interaction Agent: Alert when prescribed medications may conflict with vitals (e.g., beta-blockers + low HR)
Resource Allocation Agent: Optimize bed assignments, staff allocation, and equipment distribution

8. Integration with Electronic Health Records (EHR)

HL7 FHIR Support: Ingest patient history, medications, allergies from EHR systems (Epic, Cerner)
Bi-Directional Sync: Write AI insights back to EHR for permanent medical records
HIPAA Compliance: Implement encryption, audit logs, and access controls for PHI

9. Explainable AI (XAI)

Transparency: Show which vitals triggered specific conditions (e.g., "Bradycardia detected: HR=45 < threshold 60")
Confidence Scores: Display LLM confidence levels for recommendations
Audit Trail: Log all AI decisions for regulatory compliance and error analysis

10. Automated Testing & CI/CD

Unit Tests: Pytest for each agent's core functions
Integration Tests: End-to-end tests simulating Pub/Sub → Cloud Run → BigQuery → GCS workflows
Load Testing: Locust or K6 to validate 1000+ requests/second capacity
Continuous Deployment: Auto-deploy to staging on Git push, production on approval

🌍 Long-Term Vision (1-2 Years)

11. Cross-Domain Expansion

Manufacturing: Predictive maintenance for industrial equipment (vibration, temperature sensors)
Smart Cities: Traffic congestion prediction, air quality monitoring, flood detection
Supply Chain: Real-time shipment tracking, cold chain monitoring, delivery delays
Energy: Grid stability monitoring, renewable energy optimization, outage prediction
Agriculture: Soil moisture, crop health, pest detection, irrigation automation

12. Global Health Surveillance

Pandemic Monitoring: Detect disease outbreaks by analyzing vital trends across populations
Telemedicine Integration: Remote patient monitoring for home-based care
Wearable Device Support: Ingest data from Fitbit, Apple Watch, medical-grade wearables

13. AI Model Customization

Fine-Tuned Models: Train domain-specific LLMs on medical literature for better clinical reasoning
Multi-Modal AI: Combine vitals data with X-rays, ECGs, lab results for holistic diagnostics
Continuous Learning: Implement feedback loops where clinicians rate AI suggestions, improving accuracy over time

14. Regulatory Compliance & Certification

FDA 510(k) Clearance: Certify as a medical device for clinical decision support
ISO 27001: Information security management certification
GDPR Compliance: Data privacy controls for EU deployments

15. Open-Source Community & Ecosystem

Public GitHub Repository: Share core framework for healthcare AI innovation
Plugin Architecture: Allow third-party developers to create custom agents (e.g., pharmacy integration, billing)
Healthcare AI Marketplace: Distribute specialized agents for different conditions (diabetes, cardiac care, trauma)

16. Edge Computing for Latency-Critical Use Cases

IoT Edge Devices: Run lightweight models on hospital edge servers for sub-100ms response times
Federated Deployment: Deploy agents across hospital networks to reduce cloud dependency
5G Integration: Leverage ultra-low latency 5G for ambulance → hospital data streaming

17. Ethical AI & Bias Mitigation

Fairness Audits: Ensure AI doesn't discriminate based on age, gender, ethnicity
Human-in-the-Loop: Critical decisions always reviewed by clinicians (AI assists, humans decide)
Transparency Reports: Publish annual reports on AI performance, errors, and improvements

🎯 Ultimate Goal: Save Lives at Scale

Transform emergency detection from reactive manual monitoring to proactive AI-powered prediction across healthcare, infrastructure, and critical systems globally. By making this architecture open, adaptable, and intelligent, we enable organizations worldwide to deploy life-saving real-time monitoring in days, not years.

The future is real-time. The future is intelligent. The future is now. 🚀

Built With

adk
bigquery
cloudrun
docker
gcp
gcs
gemini
google
llm
pub/sub
python
slack
textembedding
vectorcosinesearch