Indra

Indra detecting procedural errors in real time
Front-end home page
Full multi-agent pipeline
Societal Impact

Inspiration

Indra started with a personal frustration and a systemic observation.

Syon spent years doing wet lab research - work that earned him an ISEF Grand Award, with implications for obesity research and eating disorders. He understands, from the bench up, how much a single procedural error matters. Inadequate aseptic technique. A wrong buffer. A missed step. An insufficient mix.

These aren't abstract failures - in a clinical setting, they are the difference between a valid result and a misdiagnosis or a faulty drug, as in the NECC contamination outbreak that killed 64 people.

The deeper motivation is personal. Syon is a first-generation college student from a low-income background. His sister works in healthcare equity. He's watched how systemic failures in healthcare land hardest on patients who can least absorb them - people in under-resourced community hospitals, rural clinics, smaller diagnostic labs without full supervision and quality teams.

The problem isn't just financial waste. It's that a wrong reagent in a diagnostic lab in an underserved neighborhood produces a misdiagnosis that a patient spends years, and money they don't have, trying to unwind.

Syon came into this hackathon wanting to tackle healthcare equity from the angle he's best positioned to address: the technology side, where procedures meet patients. He brought friends who could actually execute - Prit, a 4x published CV researcher with 20+ citations, and Ali, who helped ship ChronoOS to $250K ARR building production financial software for small businesses that can't afford full engineering teams. Together the team has the domain knowledge, the research depth, and the product instinct to deploy something real.

What it does

Indra is a multi-agent AI quality supervisor for clinical and regulated labs. Upload any written lab protocol and Indra parses it into structured visual checkpoints automatically. Point cameras at the procedure. A coordinated pipeline of 6 autonomous specialist agents monitors execution in real time, flags deviations the moment they occur, and generates a full deviation report with timestamps, SOP citations, and video evidence clips ready for QA review.

Today's presentation demos a diagnostic immunoassay buffer preparation procedure, similar to the kind performed daily in hospital reference labs. We inject two main deviations that Indra catches live: improper surface sterilization and insufficient mixing.

How we built it

The core architecture is a hierarchical multi-agent pipeline where each agent has a distinct perception model, decision logic, and communication protocol. They coordinate through a shared Monotonic Ledger - a single source of truth that prevents agents from contradicting each other.

We started with the obvious approach: send everything to a vision-language model and let it reason over video. It worked, but at 5+ seconds per clip it was far too slow and expensive for a real clinical environment running hundreds of procedure steps per session.

The breakthrough was routing. We built a fast System 1 layer - YOLO-World running open-vocabulary object detection at ~5ms per frame, locally, on every camera feed. Most SOP steps have unambiguous visual signatures - gloves present or absent, buffer color correct or wrong. The Routing Agent evaluates cross-camera evidence and makes one of three decisions: CONFIRMED (log it, move on), NEEDS_VLM (ambiguous, escalate to Gemini Flash for temporal video reasoning), or ABSENT (unanimous agreement across cameras - immediate violation). This reduced VLM calls by 10–50x, making real-time operation practical.
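The three-way routing decision can be sketched roughly as below. This is a minimal illustration, not the team's implementation: the `CameraEvidence` type, the `route` function, and both thresholds are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Decision(Enum):
    CONFIRMED = "CONFIRMED"   # log it, move on
    NEEDS_VLM = "NEEDS_VLM"   # ambiguous, escalate to the VLM
    ABSENT = "ABSENT"         # unanimous absence, immediate violation


@dataclass(frozen=True)
class CameraEvidence:
    camera_id: str
    has_visibility: bool          # can this camera see the step at all?
    confidence: Optional[float]   # best detector score, None if nothing was detected


# Hypothetical thresholds; real values would be tuned per step and per facility.
CONFIRM_AT = 0.6
ABSENT_BELOW = 0.2


def route(evidence: Sequence[CameraEvidence]) -> Decision:
    """Pick one of the three routing outcomes from per-camera detector evidence."""
    visible = [e for e in evidence if e.has_visibility]
    if not visible:
        # Assumption: with no usable view, escalate rather than guess.
        return Decision.NEEDS_VLM
    scores = [e.confidence if e.confidence is not None else 0.0 for e in visible]
    if max(scores) >= CONFIRM_AT:
        return Decision.CONFIRMED
    if all(score < ABSENT_BELOW for score in scores):
        return Decision.ABSENT
    return Decision.NEEDS_VLM
```

Only the middle band of scores reaches the VLM, which is where the reduction in VLM calls would come from under these assumptions.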

The dual-camera setup introduced a coordination challenge: Camera 1 sees gloves, Camera 2 doesn't. Our cross-camera consensus rule is that absence must be unanimous across all cameras with visibility for that step, so a single camera that misses an object cannot flag a violation on its own. For physically unobservable steps - a centrifuge with a closed lid looks identical running or stopped - a formal logic Inference Agent deduces what must have occurred: if load and retrieve are both confirmed, the run step happened between them.
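The centrifuge deduction can be illustrated with a small sketch. It assumes steps are kept in SOP order and that an unobservable step is inferred only when its immediate neighbours are both confirmed; the function name, the step names, and the `INFERRED` label are hypothetical.

```python
from typing import Dict, Sequence, Set


def infer_unobservable(
    steps: Sequence[str],
    status: Dict[str, str],
    unobservable: Set[str],
) -> Dict[str, str]:
    """Return a copy of `status` with unobservable steps marked INFERRED
    when the steps directly before and after them are CONFIRMED."""
    result = dict(status)
    for i, step in enumerate(steps):
        if step not in unobservable:
            continue
        if i == 0 or i == len(steps) - 1:
            continue  # no step on one side, so nothing brackets this one
        before, after = steps[i - 1], steps[i + 1]
        if status.get(before) == "CONFIRMED" and status.get(after) == "CONFIRMED":
            result[step] = "INFERRED"
    return result


sop = ["load_centrifuge", "run_centrifuge", "retrieve_tubes"]
observed = {"load_centrifuge": "CONFIRMED", "retrieve_tubes": "CONFIRMED"}
deduced = infer_unobservable(sop, observed, {"run_centrifuge"})
```

Keeping inferred steps under a separate label from directly observed ones would let a QA reviewer see which entries rest on deduction rather than video evidence.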

Every human review of a flagged deviation feeds back into the system. The next time Indra sees a similar procedure, it has concrete examples of what failure looks like at that specific facility. It gets smarter with every audit.

Challenges

Chain dependency in state management. The Monotonic Ledger's strict state machine meant that a serialization bug in one agent could silently corrupt every downstream decision. We built 25+ typed Pydantic models for inter-agent communication to make this explicit and auditable.
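To show what a typed inter-agent message buys, here is a small sketch. The team's models are Pydantic; this example swaps in standard-library dataclasses with manual checks so it runs without dependencies, and the `StepVerdict` fields are hypothetical rather than taken from the project.

```python
import json
from dataclasses import asdict, dataclass

VALID_STATES = {"CONFIRMED", "NEEDS_VLM", "ABSENT"}


@dataclass(frozen=True)
class StepVerdict:
    """One agent's verdict on one SOP step (hypothetical message shape)."""
    step_id: str
    state: str
    source_agent: str
    confidence: float

    def __post_init__(self) -> None:
        # Fail loudly at the boundary instead of passing bad data downstream.
        if not isinstance(self.step_id, str) or not self.step_id:
            raise ValueError("step_id must be a non-empty string")
        if self.state not in VALID_STATES:
            raise ValueError(f"unknown state: {self.state!r}")
        if not isinstance(self.source_agent, str) or not self.source_agent:
            raise ValueError("source_agent must be a non-empty string")
        if isinstance(self.confidence, bool) or not isinstance(self.confidence, (int, float)):
            raise ValueError("confidence must be a number")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "StepVerdict":
        # Missing or unexpected keys raise TypeError; bad values raise ValueError.
        return cls(**json.loads(raw))
```

A malformed message then fails at the point of deserialization, with an error naming the field, instead of quietly changing a later decision.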

YOLO thread safety. Concurrent YOLO-World inference on Apple MPS caused text-embedding tensor corruption - all detections collapsed to a single class. A threading lock serializing inference calls while keeping I/O parallel fixed it. Non-obvious, took hours to find.
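The shape of that fix can be sketched as follows: frame loading stays parallel while a single lock guards the model call. `FakeDetector` and `load_frame` are stand-ins invented for this example so it runs without a GPU or a detection library; they are not part of the project.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# One process-wide lock: only one thread may be inside the model at a time.
_inference_lock = threading.Lock()


class FakeDetector:
    """Stand-in for the real detector; records whether predict calls ever overlap."""

    def __init__(self) -> None:
        self.active = 0
        self.max_active = 0
        self._count_lock = threading.Lock()

    def predict(self, frame: str) -> list:
        with self._count_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.005)  # simulated inference time
        with self._count_lock:
            self.active -= 1
        return [f"detections for {frame}"]


def load_frame(camera_id: str, index: int) -> str:
    time.sleep(0.005)  # simulated disk or network I/O
    return f"{camera_id}:{index}"


def process(detector: FakeDetector, camera_id: str, index: int) -> list:
    frame = load_frame(camera_id, index)   # runs in parallel across threads
    with _inference_lock:                  # inference is serialized
        return detector.predict(frame)


def run(detector: FakeDetector, jobs: list, workers: int = 4) -> list:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process, detector, cam, i) for cam, i in jobs]
        return [f.result() for f in futures]
```

The lock costs some throughput, since inference from the two cameras no longer overlaps, but it keeps shared model state from being touched by two threads at once.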

Counting steps require temporal reasoning. "Invert 7 times" is a temporal pattern, not a spatial one. Frame-level detection can't count repetitions. The solution: force VLM escalation for counting steps with an explicit counting rubric injected into the prompt.
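One way to detect counting steps and inject a rubric is sketched below. The regular expression only handles phrasing like "N times", and the rubric wording is illustrative, not the prompt used in the project; both function names are hypothetical.

```python
import re
from typing import Optional

# Matches phrases such as "7 times"; other phrasings would need more patterns.
_COUNT_PATTERN = re.compile(r"\b(\d+)\s+times\b", re.IGNORECASE)


def required_repetitions(step_text: str) -> Optional[int]:
    """Return the repetition count named in a step, or None if there is none."""
    match = _COUNT_PATTERN.search(step_text)
    return int(match.group(1)) if match else None


def build_counting_prompt(step_text: str) -> Optional[str]:
    """Build a VLM prompt with an explicit counting rubric for counting steps."""
    count = required_repetitions(step_text)
    if count is None:
        return None  # not a counting step, so no forced escalation
    return (
        f"SOP step: {step_text}\n"
        f"Required repetitions: {count}\n"
        "Counting rubric:\n"
        "1. Watch the whole clip before answering.\n"
        "2. Count a repetition only when the full motion completes.\n"
        "3. List the timestamp of each repetition you counted.\n"
        "4. Report the total and whether it meets the required number.\n"
    )
```

A step for which `build_counting_prompt` returns a prompt would skip the frame-level path and go straight to the VLM.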

Inter-agent conflict resolution. YOLO and Gemini sometimes disagreed on the same step. The Ledger's asymmetric rule: Gemini can upgrade a violated state to confirmed (it has full video context), but YOLO cannot downgrade a Gemini-confirmed step (it only sees a single frame). This hierarchy reflects each agent's actual epistemic capability.
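The asymmetric rule can be written down in a few lines. This sketch implements only the two behaviours stated above and accepts every other write, which is an assumption; the `Ledger` and `Entry` classes and the `propose` method are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    state: str    # "CONFIRMED" or "VIOLATED"
    source: str   # "yolo" or "gemini"


class Ledger:
    """Shared record of step states with an asymmetric overwrite rule."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def get(self, step_id: str) -> Optional[Entry]:
        return self._entries.get(step_id)

    def propose(self, step_id: str, state: str, source: str) -> bool:
        """Apply a write if the rule allows it; return whether it was applied."""
        current = self._entries.get(step_id)
        gemini_confirmed = (
            current is not None
            and current.source == "gemini"
            and current.state == "CONFIRMED"
        )
        if gemini_confirmed and source == "yolo" and state != "CONFIRMED":
            return False  # a single-frame detector cannot downgrade a video-level verdict
        # Assumption: all other transitions, including Gemini upgrading
        # VIOLATED to CONFIRMED, are accepted as-is.
        self._entries[step_id] = Entry(state=state, source=source)
        return True
```

Returning a boolean rather than raising lets the caller log rejected writes for the audit trail without interrupting the pipeline.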

What we're proud of

A system that genuinely works - catching real deviations, in real time, on a real clinical procedure - with an agent architecture that's principled rather than bolted together. Every design decision maps to a real constraint: speed, cost, accuracy, auditability.

What's next

Replace the cloud VLM with a local model so no data ever leaves the facility - fully on-premise deployment for HIPAA-regulated environments.

The longer arc: give every diagnostic lab, community health center, and hospital compounding pharmacy the quality infrastructure that today only large institutions can afford.

Healthcare equity isn't just about who has access to care. It's about whether the procedure behind the result was executed correctly - regardless of whether you're at Mass General or a rural clinic. That's the problem Indra is built to solve.

Built With

cloudflare
fastapi
gemini
microsoft-teams
next.js
opencv
pydantic
python
react
tailwind
typescript
yolo

Submitted to

YHack Spring 2026

Created by

I built the full Next.js + TypeScript frontend styled as a dark lab terminal HUD with a live SOP database, dual-camera video playback synced to a step-by-step execution log, an interactive deviation evidence viewer with annotated frames, and a CAPA report generation system with functional QA sign-off. I also built the SOP parsing pipeline and the full QA alerting workflow, integrating Microsoft Teams via Incoming Webhooks with custom Adaptive Cards that push real-time deviation alerts with dual-camera evidence, SOP citations, and one-click Confirm/Dismiss buttons. The system gates the final CAPA summary until all deviations are reviewed, closing the loop on the compliance workflow.

Ali Khalfan
Original idea of Indra, the frontend and UI, filming the lab footage, and presentation. Couldn't have asked for a better team for this technical sprint. This weekend was v1, can't wait to see where it goes next :)

https://github.com/aliabbaskhalfan/Indra-MVP-Yale

Private user
I worked on the computer vision pipeline, which takes video from different camera angles as input and uses a YOLOv8 and Google Gemini Flash based consensus Ledger to detect SOP deviations.

Prit Mhala