Inspiration

In the US alone, more than 600,000 people are reported missing every year. 85% of lost people are found within the first 12 hours, and 97% within the first 24. After that, the survival rate drops from 90% to about 50% by hour 48, and falls below 10% past 72 hours.

Despite this, SAR operations are remarkably archaic. A human Incident Commander uses a paper map, a radio, and pencils. They have to mentally hold the positions of 20+ volunteers, the terrain, the wind, the subject's last known point, and pings. Then they dispatch by voice. This is a 1980s workflow.

This causes huge cognitive load on the commander, who is doing bookkeeping and geometry when they should be focusing on strategy. Dispatching is exactly the kind of task an AI agent is finally ready to do well. It is spatial, low stakes per decision (you can always re-dispatch), and easy enough to verify.

What it does

OpenSAR is an autonomous AI incident commander for ground search and rescue.

A commander seeds a mission by setting a search area over the subject's last known point. The server pulls in terrain, trails, water, and buildings, slices the area into a hex grid, computes initial probability of area, and goes live. Volunteers join from an iOS app with a 6-character code and start searching. From that moment:

  • Their phones stream GPS to the cloud every 3 seconds.
  • The hex cell underneath them flips to "searched" and shows up on every other volunteer's map within one poll cycle.
  • An OpenClaw agent running NVIDIA Nemotron watches the live grid and issues dispatches. Each dispatch has a specific segment, a sweep type, an instruction, and written reasoning.
  • The volunteer sees the segment on their map with a snap-to-trail route, taps Acknowledge, Start sweep, Mark complete, and the agent re-plans on findings.
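The "ping flips the hex underneath you" step can be sketched roughly as follows. This is a minimal illustration, not our server code: it assumes a pointy-top hex grid in a local meter frame centered on the last known point, and the function names, the 15 m circumradius, and the in-memory `searched` set are all hypothetical stand-ins for the real spatial query.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def to_local_meters(lat, lon, origin_lat, origin_lon):
    """Equirectangular projection to meters east/north of the origin.
    Adequate over a search area a few kilometers across."""
    x = math.radians(lon - origin_lon) * EARTH_RADIUS_M * math.cos(math.radians(origin_lat))
    y = math.radians(lat - origin_lat) * EARTH_RADIUS_M
    return x, y

def hex_round(q, r):
    """Round fractional axial coordinates to the nearest hex (cube rounding)."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the component with the largest rounding error so q + r + s == 0.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr

def cell_for_point(x, y, size):
    """Axial (q, r) of the pointy-top hex containing local point (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return hex_round(q, r)

def cell_center(q, r, size):
    """Local-meter center of hex (q, r); inverse of cell_for_point at centers."""
    return size * math.sqrt(3) * (q + r / 2), size * 1.5 * r

def record_ping(searched, lat, lon, origin, size=15.0):
    """Mark the hex under a GPS ping as searched. Returns (cell, newly_searched)."""
    x, y = to_local_meters(lat, lon, origin[0], origin[1])
    cell = cell_for_point(x, y, size)
    is_new = cell not in searched
    searched.add(cell)
    return cell, is_new
```

In this sketch the second ping in the same cell reports `newly_searched` as false, which mirrors the idea that only the first volunteer into a hex changes its state.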

How we built it

We split the build into three layers and worked them in parallel.

The database is the source of truth. Everything (coverage state, positions, dispatches, findings, hazards, terrain) lives in SQLite with SpatiaLite loaded as an extension. WAL mode, foreign keys on. The schema is small but does real geospatial work. A hex_cells table slices the mission area into roughly 30 meter hexagons with terrain stats and runtime flags. A segments table groups hexes into dispatch units with precomputed probability of area. Then we have pings, dispatches, findings, and hazards, all carrying real point or polygon geometry with spatial indexes. Migrations are idempotent and run on every FastAPI startup, so deployment is git pull and restart.
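To make the "idempotent migrations on every startup" idea concrete, here is a stripped-down sketch using only Python's built-in sqlite3 module. The column names are illustrative assumptions, and the SpatiaLite geometry columns and spatial indexes are left out so the example runs anywhere.

```python
import sqlite3

# Every statement is safe to run repeatedly, so the whole list can be
# replayed on each startup. Column names here are hypothetical.
MIGRATIONS = [
    """CREATE TABLE IF NOT EXISTS segments (
           id  INTEGER PRIMARY KEY,
           poa REAL NOT NULL DEFAULT 0.0      -- probability of area
       )""",
    """CREATE TABLE IF NOT EXISTS hex_cells (
           id         INTEGER PRIMARY KEY,
           segment_id INTEGER REFERENCES segments(id),
           searched   INTEGER NOT NULL DEFAULT 0
       )""",
    "CREATE INDEX IF NOT EXISTS idx_hex_cells_segment ON hex_cells(segment_id)",
]

def open_db(path):
    """Open the database with WAL journaling and foreign key enforcement."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA foreign_keys=ON")
    return conn

def migrate(conn):
    """Apply all migrations; a second call is a no-op."""
    for statement in MIGRATIONS:
        conn.execute(statement)
    conn.commit()
```

Calling `migrate(open_db("mission.db"))` at application startup (the file name is a placeholder) would bring a fresh or existing database to the same schema.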

FastAPI runs the orchestration. A single process exposes the API for both the iOS app and the OpenClaw agent. POST /field/ping writes the GPS row and atomically flips the containing hex to searched (first writer wins, in a single UPDATE). GET /mission/state.geojson returns one big FeatureCollection with segments, flagged cells, every volunteer's position, 30 minute track lines, findings, hazards, and OSM features. Every phone polls it every 5 seconds. GET /field/me/route snaps dispatch routes to actual OSM trails using a SpatiaLite ClosestPoint(geom, MakePoint(...)) query. Auth is a single bearer token per user. The threat model is "did you type the right join code."
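The "first writer wins, in a single UPDATE" pattern can be illustrated like this. It is a sketch under assumptions: the table layout and the `searched_by` column are hypothetical, and the hex ID is passed in directly, where the real endpoint would find the containing hex with a spatial query.

```python
import sqlite3

def make_demo_db():
    """Tiny in-memory stand-in for the hex_cells table (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hex_cells (id INTEGER PRIMARY KEY, "
        "searched INTEGER NOT NULL DEFAULT 0, searched_by TEXT)"
    )
    conn.execute("INSERT INTO hex_cells (id) VALUES (1)")
    conn.commit()
    return conn

def flip_hex_searched(conn, hex_id, user_id):
    """Flip a hex to searched only if nobody has flipped it yet.

    The `searched = 0` guard lives in the WHERE clause, so the check and the
    write happen in one statement and there is no read-then-write race.
    Returns True if this call was the first writer."""
    cur = conn.execute(
        "UPDATE hex_cells SET searched = 1, searched_by = ? "
        "WHERE id = ? AND searched = 0",
        (user_id, hex_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

A second volunteer walking through the same hex gets `False` back and the original attribution is preserved.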

OpenClaw plus Nemotron over MCP. The agent runs inside a sandboxed OpenShell container on the DGX, talking to Nemotron (nemotron-3-super-120b-a12b) through the OpenShell inference proxy. It reaches the mission database through a custom MCP server we built with the Python MCP SDK, over streamable-http on the Docker bridge gateway. The MCP server exposes four tools: get_searcher (identity, latest ping, 30-minute track, and active dispatch), get_findings, recent_events (a "did anything happen?" probe), and dispatch_searcher (the only write tool, wrapped in a transaction with full pre-write validation). The agent runs as a one-shot session per decision. The database is its memory, not the context window.
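The shape of a validated write tool can be sketched in plain Python. This is not the MCP server itself: the MCP wiring is omitted, and the table layout, the `DispatchRejected` exception, and the allowed sweep types are assumptions made for illustration.

```python
import sqlite3

ALLOWED_SWEEPS = {"hasty", "grid"}  # hypothetical sweep types

class DispatchRejected(Exception):
    """Raised when a proposed dispatch fails pre-write validation."""

def make_demo_db():
    """In-memory stand-in for the mission database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE searchers (id TEXT PRIMARY KEY);
        CREATE TABLE segments (id INTEGER PRIMARY KEY);
        CREATE TABLE dispatches (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            searcher_id TEXT NOT NULL,
            segment_id INTEGER NOT NULL,
            sweep_type TEXT NOT NULL,
            instruction TEXT NOT NULL,
            reasoning TEXT NOT NULL
        );
        INSERT INTO searchers VALUES ('alice');
        INSERT INTO segments VALUES (7);
    """)
    return conn

def dispatch_searcher(conn, searcher_id, segment_id, sweep_type, instruction, reasoning):
    """Validate everything against the database, then write one dispatch row.

    `with conn` commits on success and rolls back if any check raises,
    so a rejected dispatch leaves no partial state behind."""
    with conn:
        if sweep_type not in ALLOWED_SWEEPS:
            raise DispatchRejected(f"unknown sweep type: {sweep_type!r}")
        if conn.execute("SELECT 1 FROM searchers WHERE id = ?", (searcher_id,)).fetchone() is None:
            raise DispatchRejected(f"unknown searcher: {searcher_id!r}")
        if conn.execute("SELECT 1 FROM segments WHERE id = ?", (segment_id,)).fetchone() is None:
            raise DispatchRejected(f"segment {segment_id!r} is not in the grid")
        cur = conn.execute(
            "INSERT INTO dispatches (searcher_id, segment_id, sweep_type, instruction, reasoning) "
            "VALUES (?, ?, ?, ?, ?)",
            (searcher_id, segment_id, sweep_type, instruction, reasoning),
        )
        return cur.lastrowid
```

Because the model can only name a segment by ID and the ID must already exist, a made-up segment is rejected before anything is written.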

The iOS app is Expo SDK 53 and React Native 0.79, signed under our Apple Developer team. Two screens: a join form and a mission view. The mission view is one big map with layered overlays. Hex cells tinted by flag. Segment outlines colored by assignee. Other volunteers as markers with 30-minute tracks. Finding pins. Hazard polygons. The active dispatch's snap-to-trail route. A bottom card runs the dispatch state machine. The app talks to FastAPI over HTTPS through an ngrok tunnel whose URL is cached in expo-secure-store.
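The dispatch state machine behind the bottom card can be pictured as a small transition table. The sketch below is in Python for consistency with the server code rather than the app's own language, and the state names are assumptions; only the three volunteer actions (Acknowledge, Start sweep, Mark complete) come from the flow described earlier.

```python
from enum import Enum

class DispatchState(Enum):
    ASSIGNED = "assigned"          # agent has issued the dispatch
    ACKNOWLEDGED = "acknowledged"  # volunteer tapped Acknowledge
    SWEEPING = "sweeping"          # volunteer tapped Start sweep
    COMPLETE = "complete"          # volunteer tapped Mark complete

# (current state, action) -> next state. Anything not listed is invalid.
TRANSITIONS = {
    (DispatchState.ASSIGNED, "acknowledge"): DispatchState.ACKNOWLEDGED,
    (DispatchState.ACKNOWLEDGED, "start_sweep"): DispatchState.SWEEPING,
    (DispatchState.SWEEPING, "mark_complete"): DispatchState.COMPLETE,
}

def advance(state, action):
    """Return the next state, or raise ValueError for an out-of-order action."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action!r} from state {state.value!r}") from None
```

Keeping the legal transitions in one table makes it easy for the server to reject, say, a "mark complete" on a dispatch that was never acknowledged.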

Challenges we ran into

The agent kept clustering. Our first design was one prompt: here are 5 volunteers, dispatch all of them. About 30% of the time, the model would pull a far-flung volunteer back toward the cluster instead of spreading people out. We tested it head to head with per-volunteer prompts. Same model, same scaffolding, one inference per volunteer. The failure went away. The architecture is now per-volunteer by design.
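The per-volunteer structure amounts to a loop with one model call per person. The sketch below shows only that shape: the prompt wording, the `decide` callable (standing in for a Nemotron inference), and the `candidates_for` helper are all hypothetical.

```python
def build_prompt(volunteer, candidate_segments):
    """Build a prompt that mentions exactly one volunteer and their options."""
    lines = [
        f"Volunteer {volunteer['id']} is at "
        f"({volunteer['lat']:.5f}, {volunteer['lon']:.5f}).",
        "Candidate segments: " + ", ".join(str(s) for s in candidate_segments),
        "Dispatch this volunteer to exactly one candidate segment.",
    ]
    return "\n".join(lines)

def dispatch_each(volunteers, candidates_for, decide):
    """Run one independent decision per volunteer.

    `candidates_for(volunteer)` returns that volunteer's candidate segments and
    `decide(prompt)` performs a single model inference. Neither sees the
    other volunteers, so one person's decision cannot drag another's along."""
    decisions = {}
    for volunteer in volunteers:
        prompt = build_prompt(volunteer, candidates_for(volunteer))
        decisions[volunteer["id"]] = decide(prompt)
    return decisions
```

The trade-off is that no single call sees the whole team, which is exactly the limitation a team-level planner would need to address.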

Plan and action would not line up. When we gave the model long strategic preambles, it would write beautifully about sending Alice east and then dispatch her north. We flipped the structure: decide first, justify second. The dispatch tool call comes first, and the reasoning field gets filled in afterward.

The frontier as suggestion, not constraint. The model sometimes invented coordinates that looked plausible but did not exist in the grid. We hardened this on the server side. Every dispatch is validated against the database at tool call time, before any state changes.

Edge versus cloud. We initially scoped the demo for fully on-device inference. After we got it working end to end, we realized something obvious in hindsight. Real SAR coordinators often run incident command from a fire station 30 miles from the search. Cloud Nemotron is not a workaround. It is the realistic deployment model. We kept the on-device path documented as a migration target.

Accomplishments that we're proud of

  • Multiple people showing up live in the app, with coverage state lighting up across every map within 5 seconds. Walk into a hex and every other phone in the mission shades it green.

  • Real, sound dispatch decisions made fully autonomously by the agent. Nemotron picks the segment, writes the instruction, justifies the reasoning, and dispatch_searcher mutates the database. The volunteer's phone shows the new dispatch on the next poll cycle. End to end, no human in the loop.

  • The agent cannot hallucinate a search area. Every segment ID it dispatches into was generated by deterministic Python from the mission polygon and verified at tool call time.

What we learned

The biggest lesson: scaffolding matters more than model size when you put an LLM in a control loop. Nemotron is large and capable, but the moment we asked it to do spatial reasoning without deterministic Python computing the frontier, ranked regions, and per-volunteer distances first, it became unreliable.

With the scaffolding, the same model became reliable enough to dispatch real people. The agent's job is not to compute geometry. It is to look at pre-computed geometry and make judgment calls.
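As an illustration of what "pre-computed geometry" can mean, the sketch below computes per-volunteer distances deterministically and hands the model a short ranked list. The haversine formula is standard; the scoring rule (probability of area discounted by distance in kilometers) and the field names are assumptions for the example, not our actual ranking.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

def rank_segments(volunteer, segments, top_n=3):
    """Rank candidate segments for one volunteer.

    Hypothetical score: probability of area divided by (1 + distance in km),
    so nearby high-probability segments come first."""
    scored = []
    for seg in segments:
        dist = haversine_m(volunteer["lat"], volunteer["lon"], seg["lat"], seg["lon"])
        scored.append({
            "id": seg["id"],
            "distance_m": round(dist),
            "score": seg["poa"] / (1.0 + dist / 1000.0),
        })
    scored.sort(key=lambda s: s["score"], reverse=True)
    return scored[:top_n]
```

The model then chooses among a handful of already-measured options instead of estimating distances from raw coordinates.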

Second: the database is the agent's memory. We do not pass conversation history. We pass the current state of the mission database. Memory lives in SQLite where we can audit it, not in a context window where we cannot.

What's next for OpenSAR (Search & Rescue)

Right now everything runs locally: the database, the agent, and the tunnel. The agent also reasons about volunteers one at a time. That works for a 2 to 3 person campus demo, but real SAR operations have 20+ volunteers spread across square miles. Good dispatching requires the agent to hold a team-level picture of who is where, who is idle, and where the gaps are. Getting the agent to coordinate across volunteers without falling back into the clustering failure mode is the next hard problem.

Beyond that, OpenSAR is not yet deployable outside of a tech demo. It needs a proper commander UI (right now mission creation is a curl command). It needs a real deployment instead of an ngrok URL. And it needs to seed missions in arbitrary regions with terrain we have not pre-fetched. The architecture is ready for it; the remaining work is wrapping it in a UX a SAR unit could actually use.
