Inspiration
My mother passed away in 2024. For the last years of her life, she had Alzheimer's, diabetes, hypertension, hyperlipidaemia, and acute anaemia — managed across three specialists. My father, 85, carried her prescriptions, lab reports, and medicine strips in a small cloth bag — a potli — from doctor to doctor across Kolkata.
He is not unusual. 600 million Indians manage their health records this way.
During a previous project (CareMap, submitted to the Kaggle MedGemma Impact Challenge), I spoke with Dr. Manini Moudgal, a physician in India who sees over 100 patients a day. She said: "Don't build a system that adds to the burden. Build one that removes it." That became the entire design brief for Potli.
What it does
Potli turns a smartphone camera into a pocket EHR. A caregiver photographs medicine strips, an Aadhaar card, and a lab PDF — Potli extracts, verifies, and assembles a structured patient chart a doctor can read in 30 seconds. No typing. No login. No workflow change for the doctor.
The discovery that proved the design: My father collected test strips from a relative in Hooghly and sent one photo. Potli processed all four strips and flagged an expired Rosuvastatin — a heart statin she had been taking for months without realising it had expired. Expiry dates are printed in small English type on Indian blister packs. She did not read English fluently. Nobody had checked.
How we built it
Three-agent consensus pipeline on Amazon Nova (AWS Bedrock):
- Agent A: Nova Lite (
amazon.nova-lite-v1:0) — multimodal strip extraction - Agent B: Nova Pro (
amazon.nova-pro-v1:0) — independent multimodal extraction - Agent C: Nova Lite (Judge) — text-only reconciliation, per-field confidence scoring (0–100), human_review flag when confidence < 80
Full AWS stack:
- EC2 t3.medium — FastAPI + uvicorn + nginx
- Amazon DynamoDB —
potli-patientstable - Amazon S3 — strip images + lab PDFs
- IAM instance role — no hardcoded credentials
- NIH RxNorm REST API (free) — drug standardization + duplicate detection
Prompt engineering: All 7 AI prompts use a structured reflection pattern: OBSERVE → EXTRACT → VALIDATE → RETURN. This reduced hallucinations dramatically — prescription extraction went from 16 entries (12 hallucinated) to 4 correct entries on the same image.
Lab extraction: pdftoppm slices PDFs page by page → Nova Pro extracts test values with H/L flags, cross-checked against reference ranges.
Infrastructure as code: Full Terraform in infra/ — EC2, security group,
Elastic IP, IAM role.
Challenges we ran into
The MFG vs EXPIRY problem: Indian blister strips print both manufacture date and expiry date on the same foil line (e.g. "MFG.JUL.2025 EXPIRY JUN.2027"). Single-model extraction consistently confused the two. The dual-agent + judge architecture, combined with explicit label disambiguation in the prompt, was the only approach that reliably caught this.
Multi-strip judge hallucination: When the judge received the original image alongside both agent outputs for a 4-strip scan, it would re-extract the strips itself — poorly — overriding correct extractor values. The fix: make the multi-strip judge text-only. No image resend. This was a non-obvious architectural decision that only emerged through testing.
Nova 2 testing: We tested us.amazon.nova-2-lite-v1:0 (Nova 2 Lite inference
profile). The model invoked successfully but produced degraded results on Indian
blister strips — expiry dates missed, phantom strips hallucinated. The extraction
prompt was tuned against Nova v1 behavior. We reverted to preserve demo reliability.
Nova 2 prompt re-tuning is on the v2 roadmap.
Accomplishments that we're proud of
Catching the expired medicine. A real relative in a real village was taking expired Rosuvastatin — a heart statin — for months without knowing. One photograph. One scan. Potli found it. That is the whole point, and it worked.
37 passing tests. 23 unit tests (fully mocked, no AWS) + 14 integration tests against real Bedrock calls. The pipeline is tested, not just demonstrated.
The prompt engineering breakthrough. Going from 16 hallucinated extraction entries to 4 correct ones — on the same image, same model — purely through the OBSERVE → EXTRACT → VALIDATE → RETURN reflection pattern. No fine-tuning. No labelled dataset. Just structured prompting.
$0.02 per complete patient record. Aadhaar scan + 5 medicine strips + 4-page lab report. The full AI pipeline — two Nova models + judge + RxNorm — costs less than two cents. A village of 500 patients costs $116 a year.
Zero doctor friction. The doctor never logs in, never opens Potli, never changes their workflow. The caregiver does everything. The doctor reads a card. This design constraint — validated by a practising physician — is what makes adoption realistic.
Full end-to-end in a hackathon week. Agentic pipeline, human review flow, doctor chart page, lab extraction, S3 source image storage, Terraform IaaC, EC2 deployment — live at http://54.235.170.250/demo.html.
What we learned
Foundation models like Amazon Nova compressed what would have taken months of fine-tuning into a structured prompt. A three-agent consensus pipeline that cross-verifies extraction, scores confidence, and flags human review — built in days, without a labelled dataset, without a custom model. That is the real breakthrough here.
The hardest problems were not the AI calls — they were the architectural decisions that emerged only through testing: text-only judge for multi-strip, position-based pairing when agents return the same strip count, null-safety at every layer.
What's next for Potli PocketEHR
- Nova 2 evaluation — re-tune the extraction prompts for
nova-2-lite-v1:0andnova-2-pro-v1:0and run head-to-head accuracy benchmarks - Voice vitals — Hindi/Hinglish BP, weight, and chief complaint via Whisper STT + Nova Lite structuring (code exists, excluded from EC2 to reduce deploy weight)
- Drug interaction alerts — flag dangerous combinations using RxNorm interaction API
- Prescription OCR — handwritten doctor notes → structured medication list
- ASHA worker offline mode — local-first capture, sync when connected
- Systematic evaluation — labelled test set across strip types, lighting conditions, and model configurations to find the optimal Nova pipeline
Live demo: http://54.235.170.250/demo.html
Built With
- amazon-bedrock
- amazon-dynamodb
- amazon-ec2
- amazon-nova-lite
- amazon-nova-pro
- amazon-web-services
- aws-iam
- fastapi
- nginx
- nih-rxnorm-api
- pypdf
- python
- tailwind-css
- terraform
- uvicorn
- vanilla-javascript
Log in or sign up for Devpost to join the conversation.