potli-pocketEHR

Illustration of the run.
Patient Chart Recreated from the Medicine Images and Lab Report Pages.

Inspiration

My mother passed away in 2024. For the last years of her life, she had Alzheimer's, diabetes, hypertension, hyperlipidaemia, and acute anaemia — managed across three specialists. My father, 85, carried her prescriptions, lab reports, and medicine strips in a small cloth bag — a potli — from doctor to doctor across Kolkata.

He is not unusual. 600 million Indians manage their health records this way.

During a previous project (CareMap, submitted to the Kaggle MedGemma Impact Challenge), I spoke with Dr. Manini Moudgal, a physician in India who sees over 100 patients a day. She said: "Don't build a system that adds to the burden. Build one that removes it." That became the entire design brief for Potli.

What it does

Potli turns a smartphone camera into a pocket EHR. A caregiver photographs medicine strips, an Aadhaar card, and a lab PDF — Potli extracts, verifies, and assembles a structured patient chart a doctor can read in 30 seconds. No typing. No login. No workflow change for the doctor.

The discovery that proved the design: My father collected test strips from a relative in Hooghly and sent one photo. Potli processed all four strips and flagged an expired Rosuvastatin — a heart statin she had been taking for months without realising it had expired. Expiry dates are printed in small English type on Indian blister packs. She did not read English fluently. Nobody had checked.

How we built it

Three-agent consensus pipeline on Amazon Nova (AWS Bedrock):

Agent A: Nova Lite (amazon.nova-lite-v1:0) — multimodal strip extraction
Agent B: Nova Pro (amazon.nova-pro-v1:0) — independent multimodal extraction
Agent C: Nova Lite (Judge) — text-only reconciliation, per-field confidence scoring (0–100), human_review flag when confidence < 80

Full AWS stack:

EC2 t3.medium — FastAPI + uvicorn + nginx
Amazon DynamoDB — potli-patients table
Amazon S3 — strip images + lab PDFs
IAM instance role — no hardcoded credentials
NIH RxNorm REST API (free) — drug standardization + duplicate detection

Prompt engineering: All 7 AI prompts use a structured reflection pattern: OBSERVE → EXTRACT → VALIDATE → RETURN. This reduced hallucinations dramatically — prescription extraction went from 16 entries (12 hallucinated) to 4 correct entries on the same image.

Lab extraction: pdftoppm slices PDFs page by page → Nova Pro extracts test values with H/L flags, cross-checked against reference ranges.

Infrastructure as code: Full Terraform in infra/ — EC2, security group, Elastic IP, IAM role.

Challenges we ran into

The MFG vs EXPIRY problem: Indian blister strips print both manufacture date and expiry date on the same foil line (e.g. "MFG.JUL.2025 EXPIRY JUN.2027"). Single-model extraction consistently confused the two. The dual-agent + judge architecture, combined with explicit label disambiguation in the prompt, was the only approach that reliably caught this.

Multi-strip judge hallucination: When the judge received the original image alongside both agent outputs for a 4-strip scan, it would re-extract the strips itself — poorly — overriding correct extractor values. The fix: make the multi-strip judge text-only. No image resend. This was a non-obvious architectural decision that only emerged through testing.

Nova 2 testing: We tested us.amazon.nova-2-lite-v1:0 (Nova 2 Lite inference profile). The model invoked successfully but produced degraded results on Indian blister strips — expiry dates missed, phantom strips hallucinated. The extraction prompt was tuned against Nova v1 behavior. We reverted to preserve demo reliability. Nova 2 prompt re-tuning is on the v2 roadmap.

Accomplishments that we're proud of

Catching the expired medicine. A real relative in a real village was taking expired Rosuvastatin — a heart statin — for months without knowing. One photograph. One scan. Potli found it. That is the whole point, and it worked.

37 passing tests. 23 unit tests (fully mocked, no AWS) + 14 integration tests against real Bedrock calls. The pipeline is tested, not just demonstrated.

The prompt engineering breakthrough. Going from 16 hallucinated extraction entries to 4 correct ones — on the same image, same model — purely through the OBSERVE → EXTRACT → VALIDATE → RETURN reflection pattern. No fine-tuning. No labelled dataset. Just structured prompting.

$0.02 per complete patient record. Aadhaar scan + 5 medicine strips + 4-page lab report. The full AI pipeline — two Nova models + judge + RxNorm — costs less than two cents. A village of 500 patients costs $116 a year.

Zero doctor friction. The doctor never logs in, never opens Potli, never changes their workflow. The caregiver does everything. The doctor reads a card. This design constraint — validated by a practising physician — is what makes adoption realistic.

Full end-to-end in a hackathon week. Agentic pipeline, human review flow, doctor chart page, lab extraction, S3 source image storage, Terraform IaaC, EC2 deployment — live at http://54.235.170.250/demo.html.

What we learned

Foundation models like Amazon Nova compressed what would have taken months of fine-tuning into a structured prompt. A three-agent consensus pipeline that cross-verifies extraction, scores confidence, and flags human review — built in days, without a labelled dataset, without a custom model. That is the real breakthrough here.

The hardest problems were not the AI calls — they were the architectural decisions that emerged only through testing: text-only judge for multi-strip, position-based pairing when agents return the same strip count, null-safety at every layer.

What's next for Potli PocketEHR

Nova 2 evaluation — re-tune the extraction prompts for nova-2-lite-v1:0 and nova-2-pro-v1:0 and run head-to-head accuracy benchmarks
Voice vitals — Hindi/Hinglish BP, weight, and chief complaint via Whisper STT + Nova Lite structuring (code exists, excluded from EC2 to reduce deploy weight)
Drug interaction alerts — flag dangerous combinations using RxNorm interaction API
Prescription OCR — handwritten doctor notes → structured medication list
ASHA worker offline mode — local-first capture, sync when connected
Systematic evaluation — labelled test set across strip types, lighting conditions, and model configurations to find the optimal Nova pipeline