Inspiration

  • When disasters strike, responders drown in unstructured reports: social posts, 911 transcripts, volunteer notes, radio traffic. We wanted a tool that turns that chaos into concise, actionable triage—offline, fast, and trustworthy.
  • We were inspired by teams in the field who need low-latency, resilient workflows, not a fragile cloud dependency. That led us to a local-first design.

What it does

  • Extracts structured incident reports from free text into a clean JSON format (location_text, time_iso, severity, needs, notes).
  • Scores risk, provides destination suggestions from known facilities, and computes confidence from an ensemble.
  • Generates ICS-213 and SITREP summaries, plus EN/ES radio phrasing.
  • Runs locally against:
    • A base model (e.g., gpt-oss:20b via an Ollama-compatible API),
    • An ensemble for robustness and confidence,
    • Or a fine-tuned LoRA adapter model.
  • Streams tokens with a JSON parser that tolerates imperfect outputs and recovers valid JSON.
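
To make the five-field report concrete, here is a minimal sketch of how a loosely typed model output could be coerced into a typed record. The names `Incident`, `normalize_incident`, and `SEVERITIES` are hypothetical, and the severity label set and the "unknown" fallback are assumptions; the write-up does not list the actual values.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

# Assumed label set; the project's real severity values are not documented here.
SEVERITIES = ("low", "medium", "high", "critical")


@dataclass
class Incident:
    location_text: str = ""
    time_iso: Optional[str] = None
    severity: str = "unknown"
    needs: List[str] = field(default_factory=list)
    notes: str = ""


def normalize_incident(raw: dict) -> Incident:
    """Coerce a parsed model output into the five-field record."""
    severity = str(raw.get("severity") or "").strip().lower()
    if severity not in SEVERITIES:
        severity = "unknown"  # assumed fallback for unrecognized labels

    # Keep the timestamp only if it parses as ISO 8601.
    time_iso = raw.get("time_iso")
    if time_iso is not None:
        try:
            datetime.fromisoformat(str(time_iso).replace("Z", "+00:00"))
            time_iso = str(time_iso)
        except ValueError:
            time_iso = None

    # Accept either a list or a comma-separated string; dedupe, keep order.
    needs_raw = raw.get("needs") or []
    if isinstance(needs_raw, str):
        needs_raw = needs_raw.split(",")
    needs: List[str] = []
    for item in needs_raw:
        label = str(item).strip().lower()
        if label and label not in needs:
            needs.append(label)

    return Incident(
        location_text=str(raw.get("location_text") or "").strip(),
        time_iso=time_iso,
        severity=severity,
        needs=needs,
        notes=str(raw.get("notes") or "").strip(),
    )
```

Normalizing after parsing keeps the downstream scoring and summary steps independent of how tidy any single generation happens to be.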

How we built it

  • Frontend: Next.js (App Router), TypeScript, Material UI (MUI), Redux Toolkit for state, streaming UI components, and facility-based suggestions.
  • Backend API routes: /api/extract, /api/extract/stream, /api/extract/ensemble, /api/ics213, /api/radio, /api/sitrep.
  • Model serving:
    • FastAPI server exposing Ollama-compatible /api/generate for base + adapter.
    • Works with standard Ollama at http://127.0.0.1:11434 or our Python server on an alternate port via OLLAMA_BASE.
  • Fine-tuning:
    • QLoRA with PEFT on a 4-bit base using TRL SFTTrainer; compact LoRA ranks for single-GPU feasibility.
    • Prompts enforce “JSON only” outputs with a schema-like instruction.
    • Data pipeline merging synthetic and curated samples.
  • Robustness features:
    • NDJSON streaming and incremental concatenation.
    • Heuristics to strip code fences, parse between tags, and recover balanced JSON.
    • Ensemble mode to reduce variance and provide confidence.
  • Evaluation:
    • Severity accuracy and needs Jaccard similarity, driven by a simple HTTP eval script.
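
The fence-stripping and balanced-JSON recovery listed under robustness can be sketched roughly as follows. This is an illustration, not the project's actual parser: `recover_json` and `first_balanced_object` are hypothetical names, and tag-based extraction is left out because the tag names are not given in this write-up.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # built programmatically so this example nests cleanly
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def first_balanced_object(text: str) -> Optional[str]:
    """Return the first brace-balanced {...} span, ignoring braces in strings."""
    start = text.find("{")
    while start != -1:
        depth, in_string, escaped = 0, False, False
        for i in range(start, len(text)):
            ch = text[i]
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    return text[start:i + 1]
        # This opening brace never closed; try the next one.
        start = text.find("{", start + 1)
    return None


def recover_json(text: str) -> Optional[Any]:
    """Strip a code fence if present, then parse the first balanced object."""
    fenced = FENCE_RE.search(text)
    if fenced:
        text = fenced.group(1)
    candidate = first_balanced_object(text)
    if candidate is None:
        return None
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None
```

Returning `None` instead of raising lets a caller fall back to a retry or to another ensemble member when nothing parseable is found.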

Challenges we ran into

  • Enforcing strict JSON outputs; we built multi-step cleanup and balanced-brace recovery.
  • Running large models locally; quantization and LoRA helped, but CPU inference is tight.
  • Adapter/base compatibility; PEFT adapters are base-specific and mismatches break loading.
  • Tooling fragmentation; Ollama prefers GGUF while our fine-tuning artifacts are PEFT adapters.
  • Streaming UX; handling partial tokens while ensuring the final JSON parses.
  • Data curation; small datasets needed augmentation and clear labeling.
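
The streaming challenge comes down to network chunks that do not align with NDJSON line boundaries. Below is a minimal sketch of incremental concatenation, assuming each line is an Ollama-style `/api/generate` event with `response` and `done` fields; `iter_response_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_response_text(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the accumulated response text each time an NDJSON line completes."""
    buffer = ""
    text = ""
    for chunk in chunks:
        buffer += chunk
        # Only complete lines are parsed; a trailing partial line stays buffered.
        *lines, buffer = buffer.split("\n")
        for line in lines:
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)
            text += event.get("response", "")
            yield text
            if event.get("done"):
                return
```

Each yielded snapshot can be rendered in the UI right away, while JSON parsing of the full text is attempted only on the final snapshot.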

Accomplishments that we’re proud of

  • Local-first triage workflow that streams results in real time.
  • JSON extraction resilient to imperfect generations, yielding valid typed outputs.
  • Clean UI with one-click ICS-213/SITREP/radio generation.
  • Ensemble mode that improves stability and provides usable confidence signals.
  • Lightweight LoRA fine-tune that improves structured extraction quality.
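
One simple way an ensemble can produce a confidence signal is majority voting, with the agreement rate reported as confidence. The sketch below assumes that scheme; the write-up does not specify how TerraGuard actually combines ensemble members, and `aggregate` is a hypothetical name.

```python
from collections import Counter
from typing import List


def aggregate(candidates: List[dict]) -> dict:
    """Combine several extractions into one result with an agreement score."""
    if not candidates:
        raise ValueError("need at least one candidate")

    # Severity: the most common label wins; confidence is its vote share.
    votes = Counter(c.get("severity") for c in candidates)
    severity, count = votes.most_common(1)[0]

    # Needs: keep items reported by a strict majority of members.
    need_votes = Counter(n for c in candidates for n in set(c.get("needs", [])))
    needs = sorted(n for n, k in need_votes.items() if k * 2 > len(candidates))

    return {
        "severity": severity,
        "needs": needs,
        "confidence": count / len(candidates),
    }
```

With three members where two agree on "high", this reports a confidence of 2/3, which is coarse but easy for an end user to interpret.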

What we learned

  • Instruction design matters as much as fine-tuning; concise schema prompts improve JSON validity.
  • Quantization and LoRA are essential to make large models practical on constrained hardware.
  • Streaming plus tolerant parsing beats strict decoding for UX, especially with schema recovery.
  • Simple ensembles deliver valuable confidence estimates for end users.
  • Simple metrics go far; for set-like fields, Jaccard similarity is intuitive: \( J(A, B) = \frac{|A \cap B|}{|A \cup B|} \)
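
The two evaluation metrics, severity accuracy and needs Jaccard similarity, fit in a few lines. This is a sketch with hypothetical names (`jaccard`, `evaluate`); treating two empty sets as a perfect match is an assumed convention, since the formula is undefined in that case.

```python
from typing import Dict, Iterable, List


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two label collections."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # assumed convention: both empty counts as agreement
    return len(set_a & set_b) / len(set_a | set_b)


def evaluate(predictions: List[dict], references: List[dict]) -> Dict[str, float]:
    """Average severity accuracy and needs Jaccard over paired examples."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    n = len(predictions)
    correct = sum(
        p.get("severity") == r.get("severity")
        for p, r in zip(predictions, references)
    )
    overlap = sum(
        jaccard(p.get("needs", []), r.get("needs", []))
        for p, r in zip(predictions, references)
    )
    return {"severity_accuracy": correct / n, "needs_jaccard": overlap / n}
```

For example, predicted needs {water, medical} against reference {water, shelter} share one of three distinct labels, so the score is 1/3.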

What’s next for TerraGuard

  • Merge-and-quantize path: merge LoRA into base and export GGUF to create a native Ollama model/tag.
  • Constrained decoding: JSON grammars or structured decoders to eliminate post-hoc cleanup.
  • Active learning loop: capture low-confidence cases for rapid relabeling and continual fine-tuning.
  • Multilingual expansion and better time normalization; safer geocoding with ambiguity handling.
  • Packaging for edge devices; containerized GPU images for field deployment.
  • RAG over facilities/critical infrastructure with freshness guarantees and provenance.
  • Expanded evaluation across disaster domains and languages.
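
As a rough illustration of the constrained-decoding direction, recent Ollama releases accept a JSON schema in the `format` field of `/api/generate`. The sketch below only builds the request and sends nothing. The schema contents, the severity enum, the prompt wording, and the `build_request` name are assumptions, and whether a custom Ollama-compatible server honors `format` would need to be verified.

```python
import json
import os
import urllib.request

# Assumed schema for the five-field report; enum values are illustrative only.
INCIDENT_SCHEMA = {
    "type": "object",
    "properties": {
        "location_text": {"type": "string"},
        "time_iso": {"type": "string"},
        "severity": {"type": "string", "enum": ["low", "medium", "high", "critical"]},
        "needs": {"type": "array", "items": {"type": "string"}},
        "notes": {"type": "string"},
    },
    "required": ["location_text", "severity", "needs"],
}


def build_request(report_text: str, model: str = "gpt-oss:20b") -> urllib.request.Request:
    """Build (but do not send) a schema-constrained generate request."""
    base = os.environ.get("OLLAMA_BASE", "http://127.0.0.1:11434")
    payload = {
        "model": model,
        "prompt": "Extract an incident report as JSON.\n\n" + report_text,
        "format": INCIDENT_SCHEMA,  # the server constrains output to this schema
        "stream": False,
    }
    return urllib.request.Request(
        base.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would issue the call; keeping construction separate makes the payload easy to inspect and test offline.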

Built With

  • api
  • bitsandbytes
  • fastapi
  • gl
  • huggingface
  • maplibre
  • next.js
  • node.js
  • ollama
  • ollama-compatible
  • parquet
  • peft
  • pmtiles
  • protomaps
  • python
  • pytorch
  • react
  • redux
  • transformers
  • trl
  • typescript
  • uvicorn