Inspiration

At first I struggled to come up with an idea for this hackathon, and then found myself at home sick with the flu 😞. While looking up home remedies and other nonsense on WebMD, it struck me: millions of people use this platform. I could pretty easily build a chatbot that acts like a WebMD and use that as my project base. As the project took shape, I realized that beyond being a public-facing health assistant, Dr. Cloud could serve as a powerful tool for physicians: helping them take structured notes, generate patient summaries, and even maintain longitudinal health records, all with the power of Cloud Run, ADK & A2A, and Firestore!

What it does

Dr. Cloud is an AI-powered healthcare assistant that serves both patients and physicians.

  • For patients: it answers health questions, explains medical terms, and provides symptom guidance in plain language.
  • For physicians: it automates clinical documentation, summarizes patient visits, and maintains longitudinal health records in Firestore.

Powered by Google Cloud Run with NVIDIA L4 GPUs, ADK (Agent Development Kit), and A2A (Agent-to-Agent) communication, Dr. Cloud uses specialized AI agents to handle medical reasoning, lab interpretation, and clinical note generation — all running serverlessly and efficiently.

How we built it

We built Dr. Cloud using a multi-agent architecture powered by Google’s ADK:

  • Backend: Cloud Run (GPU-enabled) running MedGemma via Ollama for medical intelligence.
  • Agents: 6 specialized ADK sub-agents (symptom, lab, medication, lifestyle, referral, documentation) orchestrated by a root agent.
  • A2A integration: exposes Dr. Cloud’s capabilities for collaboration with other AI agents.
  • Persistence: Firestore stores session data, SOAP notes, and FHIR records for longitudinal health tracking.
  • Frontend: Streamlit app deployed on a second Cloud Run service, with smart JSON-to-plain-language handling for natural conversations.
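To give a feel for the orchestration layer, here is a plain-Python sketch of how the root agent delegates to the six specialist sub-agents. This is a simplified, hypothetical stand-in: in the real system ADK's LLM-driven routing decides which sub-agent handles a turn, and the keyword lists below are purely illustrative.

```python
# Simplified sketch of root-agent routing. In the real system, ADK's
# LLM-based delegation picks the sub-agent; a keyword heuristic stands
# in for it here. Keyword lists are illustrative, not the real config.
SUB_AGENTS = {
    "symptom": ["pain", "fever", "cough", "headache"],
    "lab": ["blood test", "cbc", "lab result", "cholesterol"],
    "medication": ["dose", "drug", "prescription", "ibuprofen"],
    "lifestyle": ["diet", "exercise", "sleep", "smoking"],
    "referral": ["specialist", "referral", "cardiologist"],
    "documentation": ["soap note", "summary", "chart"],
}

def route(message: str) -> str:
    """Pick the sub-agent whose keywords best match the message."""
    text = message.lower()
    scores = {
        name: sum(kw in text for kw in keywords)
        for name, keywords in SUB_AGENTS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to the symptom agent when nothing matches.
    return best if scores[best] > 0 else "symptom"

print(route("I have a fever and a bad cough"))      # symptom
print(route("Can you explain my CBC lab result?"))  # lab
```

The real root agent does the same job with far less hand-written logic, since ADK handles sub-agent transfer from the agents' own descriptions.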

Challenges we ran into

Getting MedGemma running on Google Cloud Run with GPUs was no small feat.

  • GPU cold starts are still an issue :( since keeping a Cloud Run instance permanently warm would defeat the point of a serverless service.
  • Ollama deployment quirks: Running Ollama in a containerized, stateless environment was tricky since Cloud Run doesn’t persist local storage — GCS became the lifesaver for storing model weights.
  • Environment tuning: Finding the right mix of --gpu-type nvidia-l4, --memory 32Gi, and --concurrency 4 took iteration to achieve stable inference under load.
  • ADK callbacks & Firestore integration: Coordinating multiple agents while keeping data persistent required carefully managing asynchronous callbacks. Getting everything to talk to Firestore cleanly was one of those “aha!” moments after reading more about Agent Callbacks in ADK (real lifesaver).

Accomplishments that we're proud of

  • Became (possibly!) the first developer to deploy a fine-tuned Gemma-based model (MedGemma) on Google Cloud Run GPUs 🎉. (Note that there is no official container image for MedGemma: https://github.com/google-gemini/gemma-cookbook/blob/main/Demos/Gemma-on-Cloudrun/README.md)
  • Successfully combined Ollama, ADK, A2A, and Firestore into a single, fully serverless multi-agent healthcare platform.
  • Integrated after_model_callback in ADK to store structured patient data in Firestore — enabling real, queryable longitudinal health tracking.
  • Created a GPU-accelerated backend that runs high-performance LLM inference without Kubernetes or manual infrastructure management.
  • Built a foundation for agentic interoperability — Dr. Cloud can now communicate with other agents or external systems via A2A, opening the door for collaborative healthcare AI networks.
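For reference, a deploy command along these lines is what the GPU backend ends up needing. The service name, image path, and region below are placeholders; the GPU, memory, and concurrency flags are the values the environment tuning settled on.

```shell
# Hypothetical Cloud Run deploy for the Ollama + MedGemma backend.
# Service name, image path, and region are placeholders.
gcloud run deploy dr-cloud-backend \
  --image us-docker.pkg.dev/PROJECT_ID/dr-cloud/ollama-medgemma \
  --region us-central1 \
  --gpu 1 --gpu-type nvidia-l4 \
  --memory 32Gi \
  --concurrency 4 \
  --no-cpu-throttling
```

Everything else (scaling to zero, HTTPS, revisions) comes for free from Cloud Run, which is exactly the "no Kubernetes" win described above.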

What we learned

Building Dr. Cloud taught me more than any tutorial could:

  • Serverless GPUs are a game changer — Once the deployment pain is over, Cloud Run + L4 GPUs make it ridiculously simple to scale real-time AI.
  • ADK makes complex agent systems possible — The Agent Development Kit’s orchestration, callbacks, and sub-agent routing handled what would’ve otherwise been hundreds of lines of boilerplate logic.
  • Firestore is surprisingly powerful — this was my first time using it, and I'm reasonably impressed: I didn’t expect a NoSQL database to fit clinical data so naturally. It’s also serverless, which is a double bonus!

What's next for Dr. Cloud

This was just a prototype, for sure. Here are some future nice-to-haves:

  • 🌐 Multi-language support — Bring healthcare access to non-English speakers everywhere.
  • 🗣️ Voice interaction — Enable hands-free use for patients and clinicians on the go.
  • 🧠 Vertex AI Memory Bank integration — Persistent, privacy-safe longitudinal chat history and patient memory.
  • 🖼️ Image analysis — Interpret rashes, scans, or lab reports directly from photos or PDFs.
  • 🤝 Expanded A2A interoperability — Make Dr. Cloud discoverable and callable by other AI agents, creating a true ecosystem of medical assistants and research bots.
