Inspiration
While I was FaceTiming a friend who works as an E.M.T., he mentioned how chaotic a paramedic's job can get. My first thought at the time was not "AI assistant for a hackathon," but while brainstorming for this event, I quickly realized how helpful AI could be in medical situations where every second is critical and every piece of information needs to be accurate. This is why we created Nos.
What it does
Nos listens to the paramedic, and to the patient when they can speak, and builds a live, structured picture of the case as it happens. It tracks what's been said and done, automatically identifies medications and other objects from vials or labels held up to the camera, and runs a safety check in the background that catches what would otherwise slip through: a symptom mentioned once and never followed up on, or a medication about to be given that interacts with something the patient is already on. It works even when the patient is unconscious or can't speak for themselves, since it never depends on them as the only source of truth. The moment the doors open at the ER, it generates a structured handoff report from everything captured along the way.
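To make the "mentioned once and never followed up on" check concrete, here is a minimal sketch. It is an illustration under our own assumptions, not the code Nos runs: the `Utterance` type, the `find_unfollowed_symptoms` function, the substring matching, and the two-minute grace period are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    t: float       # seconds since the start of the call
    speaker: str   # e.g. "paramedic" or "patient"
    text: str


def find_unfollowed_symptoms(transcript, symptoms, now, grace_s=120.0):
    """Flag symptoms that appear in exactly one utterance and have not
    come up again after `grace_s` seconds.

    Matching is a plain case-insensitive substring test (an assumption);
    a real extractor would be far more careful.
    """
    flags = []
    for symptom in symptoms:
        needle = symptom.lower()
        mentions = [u for u in transcript if needle in u.text.lower()]
        # Flag only a concrete, checkable gap: one mention, then silence.
        if len(mentions) == 1 and now - mentions[0].t > grace_s:
            flags.append((symptom, mentions[0]))
    return flags
```

Each flag carries the utterance it came from, so a flag can always be traced back to something that was actually said.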
How we built it
We split the system into three coordinated layers connected through an event bus, so each piece could be built in parallel without blocking the others.
- A voice layer transcribes the paramedic-patient conversation in real time.
- An agent layer extracts structured medical facts, looks for relevant information on the internet, builds the timeline, and runs continuous safety checks, including flagging unfollowed-up symptoms and known medication interactions.
- A vision layer runs Claude's vision API against a live camera feed, using motion-and-stillness detection to capture frames automatically when something, e.g. a vial or a medical bracelet, is held steady in view, without requiring the paramedic to do anything by hand. Identified items are cross-referenced against the patient's known medications before anything is administered.
- Everything converges into a single handoff report at the moment of arrival.
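The layered design above can be sketched as a tiny in-process publish/subscribe bus. This is a simplified illustration, not our actual implementation: the `EventBus` class, the topic names (`"transcript"`, `"vision"`, `"timeline"`), and the `wire_demo_pipeline` helper are hypothetical, and a real bus would likely be asynchronous.

```python
from collections import defaultdict


class EventBus:
    """Minimal synchronous publish/subscribe bus."""

    def __init__(self):
        self._handlers = defaultdict(list)  # topic -> list of callables

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        # Copy the list so handlers may subscribe while we iterate.
        for handler in list(self._handlers[topic]):
            handler(payload)


def wire_demo_pipeline():
    """Connect stand-ins for the voice, vision, and agent layers."""
    bus = EventBus()
    timeline = []

    # Agent layer: turn raw events from either source into timeline entries.
    bus.subscribe(
        "transcript",
        lambda text: bus.publish("timeline", {"source": "voice", "text": text}),
    )
    bus.subscribe(
        "vision",
        lambda item: bus.publish("timeline", {"source": "vision", "text": item}),
    )
    # The handoff report is built from whatever lands on the timeline.
    bus.subscribe("timeline", timeline.append)
    return bus, timeline
```

Because each layer only knows topic names, the voice and vision producers can be developed and tested without the agent layer being present.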
We treated privacy as a principal design constraint: raw audio and video are processed in memory and never written to disk or persistent storage. The only data retained is the structured visit record itself (timeline, medications, flags), which is what the handoff report is built from and is kept to the minimum necessary for continuity of care. A database of past handoffs is kept so a first responder or medical care provider can access them later; any handoff can be deleted manually, and all handoffs are deleted automatically after two weeks. While we've taken steps to address privacy, we believe persistent visit data would need encryption at rest and role-based access controls before real deployment in order to respect a patient's privacy as much as possible.
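The retention rule described above (manual deletion plus automatic deletion after two weeks) could look roughly like this. It is a sketch under assumptions: the in-memory dict, the `created_at` field, and the function names are hypothetical stand-ins for a real database layer.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=14)  # the two-week window from the text


def purge_expired(handoffs, now=None):
    """Return only the handoffs still inside the retention window.

    `handoffs` maps a handoff id to a record dict holding a
    timezone-aware `created_at` datetime (an assumed schema).
    """
    now = now or datetime.now(timezone.utc)
    return {
        hid: rec
        for hid, rec in handoffs.items()
        if now - rec["created_at"] < RETENTION
    }


def delete_handoff(handoffs, handoff_id):
    """Manual deletion; a missing id is silently ignored."""
    handoffs.pop(handoff_id, None)
```

Running `purge_expired` on a schedule (or on every read) keeps the rule enforced in code instead of relying on someone remembering to clean up.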
Challenges we ran into
- Tuning the vision pipeline's capture trigger so it fires reliably on a held-up vial without flooding the system with redundant calls on every frame. We also let the user trigger a capture themselves whenever they think something is relevant.
- Keeping the safety agent anchored to real, verifiable gaps (a stated symptom with no follow-up, a known drug interaction) instead of drifting into vague or unfounded clinical judgment. An earlier version of Nos would alert the user if a patient stated that they were "old," which is worth noting but is not an immediate "concern" on its own.
- Figuring out how to account for unconscious or unresponsive patients. Our transcription can recognize different speakers, but who is speaking doesn't change what gets passed to the handoff, so the report does not rely on the patient as its only source.
- Building privacy into the architecture itself. We had difficulty deciding what never gets persisted and what gets shown. This was the challenge we least expected to deal with, but it was arguably the most interesting problem for us.
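The capture-trigger tuning from the first challenge can be illustrated with a small motion-then-stillness state machine. This is a sketch, not our production trigger: the class name, the mean-absolute-difference score, and the default thresholds are assumptions chosen for the example.

```python
def motion_score(prev_frame, frame):
    """Mean absolute pixel difference between two equally sized
    grayscale frames given as flat sequences of ints (assumed format)."""
    return sum(abs(a - b) for a, b in zip(prev_frame, frame)) / len(frame)


class StillnessTrigger:
    """Fire once when motion is followed by a run of still frames."""

    def __init__(self, motion_threshold=8.0, still_frames=5):
        self.motion_threshold = motion_threshold
        self.still_frames = still_frames
        self._armed = False    # True once motion has been seen
        self._still_run = 0    # consecutive still frames so far

    def update(self, score):
        """Feed one per-frame motion score; return True to capture."""
        if score >= self.motion_threshold:
            self._armed = True
            self._still_run = 0
            return False
        self._still_run += 1
        if self._armed and self._still_run >= self.still_frames:
            # Disarm so a static scene does not trigger on every frame.
            self._armed = False
            return True
        return False
```

Disarming after each capture is what keeps the trigger from issuing redundant vision calls while an object simply stays in view; a new capture needs new motion first.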
Accomplishments that we're proud of
- A multimodal pipeline that combines voice, vision, and structured reasoning, all feeding into one coherent report rather than three disconnected demos.
- A safety agent that catches real, demonstrable gaps live, anchored to actual transcript and vision content rather than vague heuristics.
- A privacy policy we can actually defend: by design, no persistent raw audio or video, and a push for confidentiality when handling patient data in databases or third-party apps.
What we learned
- Specialized agents beat one big prompt. Splitting extraction, timeline-building, safety-checking, and handoff generation into separate agents made each one easier to reason about and debug, even though it meant more coordination overhead through the event bus.
- Multimodal inputs need to actually align with each other. The vision and transcript pipelines only became useful once we cross-referenced them. A vial identified by the camera matters because it's checked against what was said, not as a standalone fact.
- Privacy is a policy that has to be decided before building. Choices like never writing raw audio/video to disk only work if they're baked into the pipeline from the start. Retrofitting them after the architecture is set is much harder than designing for them from the first hour.
- The line between "assisting a first responder" and "replacing their judgment" is surprisingly thin. We had to actively rework early ideas (like flagging based on a patient's age alone) that sounded helpful but were really the system making a clinical call it had no real basis for.
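As an illustration of the cross-referencing lesson, the sketch below checks a camera-identified medication against the patient's known medications using a lookup table only, so every flag points to a specific documented pair. The table contents, drug names, and function name are placeholders we made up for the example, not clinical data and not our actual code.

```python
# Placeholder table: the names and note are hypothetical, not clinical data.
INTERACTIONS = {
    frozenset({"drug_a", "drug_b"}): "placeholder: documented interaction",
}


def check_before_administering(candidate, known_medications,
                               interactions=INTERACTIONS):
    """Return flags for a medication seen by the camera.

    A flag is raised only when the (candidate, known medication) pair is
    present in the table, so the system never makes a call it cannot
    trace back to a source.
    """
    cand = candidate.strip().lower()
    flags = []
    for med in known_medications:
        name = med.strip().lower()
        note = interactions.get(frozenset({cand, name}))
        if note:
            flags.append(
                {"candidate": cand, "conflicts_with": name, "evidence": note}
            )
    return flags
```

Anything not covered by the table produces no flag at all, which leaves the judgment with the first responder instead of the system.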
What's next for Nos: Ambulance Assistant
- Our primary goal is to move more of the pipeline to fully local, on-device models so nothing leaves the vehicle at all. We experimented with this during the hackathon by hosting some of our agents locally and see it as the clear production direction, particularly for the vision component, which currently uses Claude's hosted VLM API for accuracy.
- For any remaining third-party model usage, production deployment would require formal data agreements, including but not limited to a signed BAA and a no-training guarantee. This is a real legal commitment that we haven't pursued at hackathon scale, but it is non-negotiable before Nos could be used with real patient data. We'd also want tighter integration with real EHR systems, and more rigorous validation of the safety agent's flagging accuracy against real EMS protocols rather than our own judgment.