Inspiration

Healthcare access is uneven, and many people search symptoms on the web before they ever talk to a clinician. That flow often trades privacy for convenience: raw health text goes to servers you do not control. I was inspired by the Catalyst for Care to prove a different model: symptom guidance that can run entirely on the phone, with low latency, clear triage-style output, and no cloud inference for the core assistant.

I also cared about real usability beyond a chat box: nearby care (clinics / ER context), medication reminders that stay local, and a history of what the user already explored—so the app feels like a small health companion, not a one-off demo.

What it does

MediMatch is an iOS app that helps users think through symptoms in plain language and returns a single structured result (not a multi-turn chat): severity, short guidance, possible explanations with confidence, red flags when present, and next steps (including when to seek urgent or emergency care). Users can type, tap common symptom chips, and use on-device speech where supported.

The Triage flow runs the on-device pipeline: quick safety checks, a prompt guard model, and a triage LLM that emits both human-readable text and a MEDIMATCH_JSON block the app parses into cards. History stores past runs locally. Clinics uses MapKit to help users find care nearby (this path uses the network for search). Medications supports local reminders via notifications. Settings surfaces privacy (what stays on device), accessibility, and model install status. The app is informational only—it does not diagnose or replace a clinician.
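To make the "single structured result" concrete, here is a minimal sketch of how the app might model and extract the MEDIMATCH_JSON payload. The field names (`severity`, `candidates`, `redFlags`, `nextSteps`) are assumptions for illustration; the real schema lives in the app's prompt contract.

```swift
import Foundation

// Hypothetical shape of the MEDIMATCH_JSON payload (field names assumed).
struct TriageResult: Codable {
    struct Candidate: Codable {
        let name: String        // possible explanation
        let confidence: Double  // 0.0–1.0
    }
    let severity: String        // e.g. "mild" | "moderate" | "urgent"
    let guidance: String        // short human-readable advice
    let candidates: [Candidate] // explanations with confidence
    let redFlags: [String]      // present only when detected
    let nextSteps: [String]     // including when to seek urgent care
}

// Pull the JSON object out of the model's mixed text-plus-JSON reply.
func parseTriage(from reply: String) -> TriageResult? {
    guard let marker = reply.range(of: "MEDIMATCH_JSON"),
          let open = reply.range(of: "{", range: marker.upperBound..<reply.endIndex)
    else { return nil }
    // Scan forward for the matching close brace of the top-level object.
    var depth = 0
    var i = open.lowerBound
    while i < reply.endIndex {
        if reply[i] == "{" { depth += 1 }
        if reply[i] == "}" {
            depth -= 1
            if depth == 0 {
                let json = String(reply[open.lowerBound...i])
                return try? JSONDecoder().decode(TriageResult.self,
                                                 from: Data(json.utf8))
            }
        }
        i = reply.index(after: i)
    }
    return nil
}
```

Decoding into a `Codable` struct keeps the UI honest: if the model drifts from the contract, parsing fails visibly instead of rendering a malformed card.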

How I built it

Stack: Native Swift / SwiftUI, iOS 17+, ZETIC Melange (Swift package ZeticMLange from ZeticMLangeiOS), local JSON persistence in Application Support.

AI pipeline (on-device):

  1. Heuristic safety filter — cheap checks before any model.
  2. PromptGuardService (ZeticMLangeModel) for symptom_input_processing / guard-style classification.
  3. TriageLLMService (ZeticMLangeLLMModel) for the main single-pass triage reply; I stop generation when a complete MEDIMATCH_JSON block is present and cap worst-case length with a max output token budget in AppConfig.
  4. Parse + second guard pass on output, then persist a bounded history.
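The four steps above can be sketched as one orchestrator function. The service names follow the write-up; the method names (`isObviouslyUnsafe`, `classify`, `generate`, `AppConfig.maxOutputTokens`, and the stores) are assumptions, not the actual API.

```swift
// Hedged sketch of the on-device triage pipeline (method names assumed).
enum TriageError: Error { case blockedBySafety, blockedByGuard, unparseable }

func runTriage(_ input: String) async throws -> TriageResult {
    // 1. Cheap heuristic safety filter before any model runs.
    guard !SafetyHeuristics.isObviouslyUnsafe(input) else {
        throw TriageError.blockedBySafety
    }
    // 2. On-device prompt-guard classification of the input.
    guard try await PromptGuardService.shared.classify(input) == .benign else {
        throw TriageError.blockedByGuard
    }
    // 3. Single-pass triage LLM, capped by the AppConfig token budget.
    let reply = try await TriageLLMService.shared.generate(
        prompt: input,
        maxTokens: AppConfig.maxOutputTokens)
    // 4. Parse the MEDIMATCH_JSON block, guard the output, persist history.
    guard let result = parseTriage(from: reply),
          try await PromptGuardService.shared.classify(reply) == .benign else {
        throw TriageError.unparseable
    }
    HistoryStore.shared.append(result) // bounded local history
    return result
}
```

Keeping the whole pipeline in one `async` function makes cancellation and error display straightforward from the UI side.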

App structure: Tab-based UI—Triage, Clinics (MapKit; network for search only), Medications (local notifications), History, Settings (privacy dashboard, accessibility, model status). Localization (en / es / fr) and accessibility (Dynamic Type, contrast, VoiceOver labels, on-device speech where available).
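The Medications tab's "local reminders via notifications" reduce to a small piece of the real UserNotifications API. This is a minimal sketch, assuming notification permission has already been granted; the identifier scheme is illustrative.

```swift
import UserNotifications

// Fully local daily medication reminder; nothing leaves the device.
func scheduleReminder(medication: String, hour: Int, minute: Int) {
    let content = UNMutableNotificationContent()
    content.title = "Medication reminder"
    content.body = "Time to take \(medication)."
    content.sound = .default

    // Fire every day at the given local time.
    var components = DateComponents()
    components.hour = hour
    components.minute = minute
    let trigger = UNCalendarNotificationTrigger(dateMatching: components,
                                                repeats: true)

    let request = UNNotificationRequest(identifier: "med-\(medication)",
                                        content: content,
                                        trigger: trigger)
    UNUserNotificationCenter.current().add(request)
}
```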

Delivery: Xcode project on Mac; GitHub Actions can produce an unsigned IPA for install via tools like Sideloadly for teammates without a Mac.

Challenges I ran into

  • Balancing helpfulness with safety and length. Early versions let the model stream long, repetitive text. I tightened the prompting, added early termination once a valid JSON block appears, added post-processing to compact repeated prose, and changed the UI to show a spinner until the final structured card—so users see one clean result, not a half-typed wall of text.
  • Prompt guard tokenization. The shipped placeholder tokenizer is byte-level; a full SentencePiece match may be required for some llama_prompt_guard exports—I documented that as a known limitation.
  • “On-device” vs “network at all.” The app does not send symptoms to a custom backend for inference, but the ZETIC SDK may download model artifacts, and MapKit needs connectivity for Clinics—I had to describe that split clearly in the product and in docs.
  • iOS build and distribution: real-device inference, signing, and sideloading for non–Mac users added workflow overhead; I captured install paths in the README.
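The "early termination after valid JSON" fix amounts to a cheap check run against the streaming buffer after each token. A minimal sketch, assuming the token loop exposes the accumulated text and a budget (the parameter names are mine):

```swift
// Stop streaming once a complete MEDIMATCH_JSON block is present,
// or once the hard token budget is exhausted.
func shouldStopStreaming(buffer: String,
                         tokensSoFar: Int,
                         maxTokens: Int) -> Bool {
    if tokensSoFar >= maxTokens { return true } // worst-case cap

    guard let marker = buffer.range(of: "MEDIMATCH_JSON") else { return false }
    var depth = 0
    var seenOpen = false
    for c in buffer[marker.upperBound...] {
        if c == "{" { depth += 1; seenOpen = true }
        if c == "}" { depth -= 1 }
        if seenOpen && depth == 0 { return true } // top-level object closed
    }
    return false // JSON started but not yet complete
}
```

Brace counting is enough here because the contract asks for exactly one top-level JSON object after the marker; a stray `{` inside a string value would be the failure mode to watch for.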

Accomplishments that I'm proud of

  • End-to-end on-device triage with a real orchestrator (heuristics → guard → LLM → parse → history), not a throwaway “call an API” stub.
  • A prompt contract the UI can trust: MEDIMATCH_JSON plus clear display for severity, next steps, and candidate lines with confidence—so the experience stays one screen and one final card.
  • Honest privacy story: local persistence for history and medications, and copy that matches what the code does (with an explicit line between triage and Clinics networking).
  • Native polish where it costs effort: Dynamic Type and VoiceOver-conscious layouts, localization, medication form keyboard handling, and a CI path to build an IPA for people without Xcode.

What I learned

  • On-device ML is a product design problem, not only a model problem. Latency, memory, and download size on a phone force you to think in token budgets, early stopping, and structured output—not endless streaming prose.
  • ZETIC Melange made it practical to run both a small classifier (llama_prompt_guard) and a larger generative model (gemma-3n-E2B-it) behind one SDK and one model-keying scheme, with clear cleanup contracts for ZeticMLangeModel and ZeticMLangeLLMModel.
  • Prompting is the main quality lever when you cannot call a server “for a smarter answer.” I learned to ask the model for one readable block plus a machine-parseable payload (our MEDIMATCH_JSON schema) so the UI can show severity, next steps, and candidate explanations with explicit confidences.
  • Swift concurrency matters: inference must not run on the main thread; streaming and cancellation need to propagate cleanly from the UI through to the token loop.
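The concurrency point above can be sketched as a view model that owns a cancellable `Task`. The `runTriage`, `show`, and `showError` names are placeholders for whatever the app actually calls; the pattern is the point.

```swift
import SwiftUI

// Hedged sketch: inference stays off the main actor, and cancellation
// propagates from the UI down to the token loop via Task cancellation.
@MainActor
final class TriageViewModel: ObservableObject {
    @Published var isRunning = false
    private var task: Task<Void, Never>?

    func start(_ input: String) {
        task?.cancel()          // only one triage run at a time
        isRunning = true
        task = Task {
            defer { isRunning = false }
            do {
                // runTriage is nonisolated async, so the model work
                // runs off the main actor; Task.checkCancellation()
                // inside the token loop tears generation down promptly.
                let result = try await runTriage(input)
                show(result)
            } catch is CancellationError {
                // User navigated away; nothing to show.
            } catch {
                showError(error)
            }
        }
    }

    func cancel() { task?.cancel() }
}
```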

I also reinforced a ground rule: MediMatch is informational triage, not a diagnosis or a medical device—the product and copy have to say that clearly.

What's next for MediMatch

  • Harden the guard model path by aligning the tokenizer with the exact llama_prompt_guard export (SentencePiece or vendor docs), so classification quality matches the rest of the stack.
  • Deeper quality passes on triage: more eval-style scenarios, edge cases (vague input, multiple symptoms), and regression checks when the prompt or model version changes.
  • Accessibility and language coverage as I learn from users—more polish for VoiceOver paths and additional locales if demand appears.
  • Product clarity only: clearer offline states after models are cached, and continued separation between on-device triage and Clinics (networked map search)—without turning the app into a “chat with a server” product.

I would treat any clinical or regulated direction as a different product with proper evidence and review—MediMatch stays in the informational lane unless that scope is explicitly and responsibly expanded.
