Inspiration
- When disasters strike, responders drown in unstructured reports: social posts, 911 transcripts, volunteer notes, radio traffic. We wanted a tool that turns that chaos into concise, actionable triage—offline, fast, and trustworthy.
- We were inspired by teams in the field who need low-latency, resilient workflows, not a fragile cloud dependency. That led us to a local-first design.
What it does
- Extracts structured incident reports from free text into a clean JSON format (location_text, time_iso, severity, needs, notes).
- Scores risk, provides destination suggestions from known facilities, and computes confidence from an ensemble.
- Generates ICS-213 and SITREP summaries, plus EN/ES radio phrasing.
- Runs locally against:
  - A base model (e.g., `gpt-oss:20b` via an Ollama-compatible API),
  - An ensemble for robustness and confidence,
  - Or a fine-tuned LoRA adapter model.
- Streams tokens with a JSON parser that tolerates imperfect outputs and recovers valid JSON.
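To make the extracted record concrete, here is a minimal sketch of how the five fields could be typed and normalized after parsing. The field names come from the description above; the severity vocabulary, the `"unknown"` fallback, and the `IncidentReport` and `coerce_report` names are assumptions for illustration, not the project's actual code.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional

# Hypothetical severity labels; the write-up does not list the real ones.
SEVERITIES = ("low", "medium", "high", "critical")


@dataclass
class IncidentReport:
    location_text: str
    time_iso: Optional[str]
    severity: str
    needs: List[str] = field(default_factory=list)
    notes: str = ""


def coerce_report(raw: Dict[str, Any]) -> IncidentReport:
    """Coerce a loosely shaped model output into a typed report."""
    severity = str(raw.get("severity", "")).strip().lower()
    if severity not in SEVERITIES:
        severity = "unknown"  # assumed fallback for unrecognized labels

    # Keep the timestamp only if it parses as ISO 8601.
    time_iso = raw.get("time_iso")
    if isinstance(time_iso, str):
        try:
            datetime.fromisoformat(time_iso)
        except ValueError:
            time_iso = None
    else:
        time_iso = None

    # Accept a single string or a list; lowercase, deduplicate, and sort.
    needs_raw = raw.get("needs") or []
    if isinstance(needs_raw, str):
        needs_raw = [needs_raw]
    needs = sorted({str(n).strip().lower() for n in needs_raw if str(n).strip()})

    return IncidentReport(
        location_text=str(raw.get("location_text", "")).strip(),
        time_iso=time_iso,
        severity=severity,
        needs=needs,
        notes=str(raw.get("notes", "")).strip(),
    )
```

Normalizing `needs` into a sorted, deduplicated list keeps set-based comparisons between predicted and reference reports straightforward.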
How we built it
- Frontend: Next.js (App Router), TypeScript, Material UI (MUI), Redux Toolkit for state, streaming UI components, and facility-based suggestions.
- Backend API routes: `/api/extract`, `/api/extract/stream`, `/api/extract/ensemble`, `/api/ics213`, `/api/radio`, `/api/sitrep`.
- Model serving:
  - FastAPI server exposing Ollama-compatible `/api/generate` for base + adapter.
  - Works with standard Ollama at `http://127.0.0.1:11434` or our Python server on an alternate port via `OLLAMA_BASE`.
- Fine-tuning:
  - QLoRA with PEFT on a 4-bit base using TRL `SFTTrainer`; compact LoRA ranks for single-GPU feasibility.
  - Prompts enforce “JSON only” outputs with a schema-like instruction.
  - Data pipeline merging synthetic and curated samples.
- Robustness features:
  - NDJSON streaming and incremental concatenation.
  - Heuristics to strip code fences, parse between tags, and recover balanced JSON.
  - Ensemble mode to reduce variance and provide confidence.
- Evaluation:
  - Severity accuracy and needs Jaccard similarity, driven by a simple HTTP eval script.
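The NDJSON streaming and incremental concatenation step can be sketched roughly as follows. This assumes each streamed line is a JSON object carrying a `response` text fragment and a `done` flag, as in Ollama's streaming `/api/generate` replies; the function name `concat_ndjson` is hypothetical, and the input is any iterable of lines so the sketch runs without a server.

```python
import json
from typing import Iterable, List


def concat_ndjson(lines: Iterable[str]) -> str:
    """Join the `response` fragments of an NDJSON stream into one string."""
    parts: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank keep-alive lines carry no data
        try:
            chunk = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a malformed line rather than abort the stream
        if not isinstance(chunk, dict):
            continue
        fragment = chunk.get("response")
        if isinstance(fragment, str):
            parts.append(fragment)
        if chunk.get("done"):
            break  # the final chunk marks the end of generation
    return "".join(parts)
```

The concatenated text is still raw model output at this point, so it would pass through the cleanup heuristics before being parsed as JSON.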
Challenges we ran into
- Enforcing strict JSON outputs; we built multi-step cleanup and balanced-brace recovery.
- Running large models locally; quantization and LoRA helped, but CPU inference is tight.
- Adapter/base compatibility; PEFT adapters are base-specific and mismatches break loading.
- Tooling fragmentation; Ollama prefers GGUF while our fine-tuning artifacts are PEFT adapters.
- Streaming UX; handling partial tokens while ensuring the final JSON parses.
- Data curation; small datasets needed augmentation and clear labeling.
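The multi-step cleanup and balanced-brace recovery mentioned above might look something like this sketch. It strips code-fence markers and then scans for the first brace-balanced object while ignoring braces inside string literals. The helper names are hypothetical, and the tag-based parsing step is left out because the tag format is not described here.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # code-fence marker, built this way to keep this listing readable
FENCE_RE = re.compile(re.escape(FENCE) + r"(?:json)?\s*", re.IGNORECASE)


def first_balanced_object(text: str) -> Optional[str]:
    """Return the first brace-balanced {...} span, ignoring braces in strings."""
    start = text.find("{")
    while start != -1:
        depth = 0
        in_string = False
        escaped = False
        for i in range(start, len(text)):
            ch = text[i]
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    return text[start : i + 1]
        # No balanced span from this opening brace; try the next one.
        start = text.find("{", start + 1)
    return None


def recover_json(text: str) -> Optional[Any]:
    """Strip fence markers, then parse the first balanced object if one exists."""
    candidate = first_balanced_object(FENCE_RE.sub("", text))
    if candidate is None:
        return None
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None
```

Returning `None` instead of raising lets the caller decide whether to retry the generation or surface a partial result in the UI.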
Accomplishments that we’re proud of
- Local-first triage workflow that streams results in real time.
- JSON extraction resilient to imperfect generations, yielding valid typed outputs.
- Clean UI with one-click ICS-213/SITREP/radio generation.
- Ensemble mode that improves stability and provides usable confidence signals.
- Lightweight LoRA fine-tune that improves structured extraction quality.
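One simple way an ensemble can yield a confidence signal is to take the majority label and report the fraction of runs that agree with it. The sketch below illustrates that idea for the severity field. How TerraGuard actually computes confidence is not specified here, so the agreement-fraction rule and the `vote_severity` name are assumptions.

```python
from collections import Counter
from typing import Sequence, Tuple


def vote_severity(votes: Sequence[str]) -> Tuple[str, float]:
    """Return the majority label and the fraction of votes that agree with it."""
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(v.strip().lower() for v in votes)
    label, top = counts.most_common(1)[0]
    return label, top / len(votes)
```

For example, three runs voting `high`, `High`, and `medium` give the label `high` with a confidence of 2/3.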
What we learned
- Instruction design matters as much as fine-tuning; concise schema prompts improve JSON validity.
- Quantization and LoRA are essential to make large models practical on constrained hardware.
- Streaming plus tolerant parsing beats strict decoding for UX, especially with schema recovery.
- Simple ensembles deliver valuable confidence estimates for end users.
- Simple metrics go far; for set-like fields, Jaccard similarity is intuitive: \( J(A, B) = \frac{|A \cap B|}{|A \cup B|} \)
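As a worked example of the Jaccard score on the needs field: if the model predicts {water, medical} and the reference is {water, shelter}, the intersection has one element and the union has three, so the score is 1/3. A minimal sketch follows; treating two empty sets as a perfect match is a convention assumed here, not something the write-up states.

```python
from typing import Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of labels."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # assumed convention: two empty sets count as a perfect match
    return len(set_a & set_b) / len(union)
```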
What’s next for TerraGuard
- Merge-and-quantize path: merge LoRA into base and export GGUF to create a native Ollama model/tag.
- Constrained decoding: JSON grammars or structured decoders to eliminate post-hoc cleanup.
- Active learning loop: capture low-confidence cases for rapid relabeling and continual fine-tuning.
- Multilingual expansion and better time normalization; safer geocoding with ambiguity handling.
- Packaging for edge devices; containerized GPU images for field deployment.
- RAG over facilities/critical infrastructure with freshness guarantees and provenance.
- Expanded evaluation across disaster domains and languages.
Built With
- api
- bitsandbytes
- fastapi
- gl
- huggingface
- maplibre
- next.js
- node.js
- ollama
- ollama-compatible
- parquet
- peft
- pmtiles
- protomaps
- python
- pytorch
- react
- redux
- transformers
- trl
- typescript
- uvicorn