Training milestone: our SAR vessel detector is live, and here's the honest story behind it

We just wrapped the core detection pipeline for OceanGuard AI, and we want to be transparent about what "trained" actually means here — both what we proved and what's genuinely next.

What we trained on (and why it's a deliberately small, focused set)

We did not throw a massive dataset at this. We trained YOLO11n on HRSID, which contains 5,604 high-resolution SAR images with 16,951 labeled ships, using a 2,857 train / 715 validation split. That's a focused, high-quality dataset, not a broad one.

  • Model: YOLO11n
  • Epochs: 50
  • Image size: 640x640
  • Batch: 16
  • Hardware: Tesla T4 GPU
  • Time: 1.69 hours
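
For anyone who wants to reproduce a run like this, here is a minimal sketch using the Ultralytics Python API with the settings listed above. The dataset config name `hrsid.yaml` and the function name `train_detector` are placeholders we made up for illustration, not files or names from our repo, and the sketch assumes HRSID has already been converted to YOLO label format.

```python
# Training settings taken from the run described above.
TRAIN_ARGS = {"epochs": 50, "imgsz": 640, "batch": 16}


def train_detector(data_yaml="hrsid.yaml", weights="yolo11n.pt"):
    """Fine-tune YOLO11n and return validation metrics.

    `data_yaml` is a hypothetical Ultralytics dataset config that points
    at the train and validation image folders.
    """
    # Imported lazily so the module can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)                      # load pretrained YOLO11n weights
    model.train(data=data_yaml, **TRAIN_ARGS)  # run training
    # The returned metrics object exposes box.map50 (mAP50) and box.map (mAP50-95).
    return model.val()
```

Calling `train_detector()` on a GPU machine would start the run; the exact wall-clock time depends on hardware and data loading.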

Results:

Metric Score
mAP50 0.838
mAP50-95 0.579
Precision 0.830
Recall 0.818

We're being upfront: this is a small, single-source training set. It's enough to prove the concept works — that a model can learn to find ship-shaped bright returns in SAR backscatter — but it is not yet trained on the scale or diversity of imagery a production maritime surveillance system would need.

Proving it isn't just overfit to its own dataset

The number that actually matters more to us than mAP50 is this: we took the trained model and ran it on a scene it had never seen — a real Sentinel-1 SAR pass from xView3 (28,676 x 24,522 px, way bigger than anything in training). We tiled it into 1,174 usable 640x640 chips and ran inference.
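
The tiling step can be sketched roughly as follows. This is an illustration under assumptions, not our exact code: it uses non-overlapping windows, and the "usable" rule shown here (at least half the pixels are not nodata) is a stand-in for whatever filter a given pipeline applies. The function names are hypothetical.

```python
def tile_windows(width, height, tile=640):
    """Yield (col_off, row_off, w, h) windows that cover the scene without overlap.

    Windows on the right and bottom edges are smaller than `tile`.
    """
    for row_off in range(0, height, tile):
        for col_off in range(0, width, tile):
            yield (col_off, row_off,
                   min(tile, width - col_off),
                   min(tile, height - row_off))


def is_usable(chip_rows, nodata=0, min_valid_fraction=0.5):
    """Assumed usability rule: keep a chip if enough pixels are not nodata.

    `chip_rows` is a 2D sequence of pixel values (a list of rows).
    """
    total = sum(len(row) for row in chip_rows)
    valid = sum(1 for row in chip_rows for value in row if value != nodata)
    return total > 0 and valid / total >= min_valid_fraction


# Grid size for a 28,676 x 24,522 px scene, before any usability filtering.
windows = list(tile_windows(28676, 24522))
print(len(windows))  # 1755, i.e. 45 x 39 windows
```

The full grid has 1,755 windows, so a count of 1,174 usable chips means roughly a third of the windows were dropped, which is what you would expect when a scene has nodata borders.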

Result: 122 vessel detections on real imagery, with confidences up to 0.76, each georeferenced from raw pixel coordinates to lat/lon using rasterio + pyproj. The scene covers the Gulf of Guinea, not our target region, and we're stating that plainly rather than pretending otherwise. It's there specifically as a generalization check, separate from our actual conservation use case.
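
To make the pixel-to-lat/lon step concrete, here is a hedged sketch of the coordinate chain: chip-local pixel, then full-scene pixel, then map coordinates, then longitude/latitude. The affine math is written out by hand so the idea is visible; in practice the six coefficients would come from the rasterio dataset's `transform`, and the source CRS from its `crs`. The function names and the example transform values are illustrative assumptions.

```python
def chip_to_scene(col_off, row_off, x, y):
    """Shift a chip-local pixel coordinate by the chip's offset in the full scene."""
    return col_off + x, row_off + y


def pixel_to_map(transform, col, row):
    """Apply an affine geotransform (a, b, c, d, e, f) to a pixel coordinate.

    Same convention as rasterio's Affine: x = a*col + b*row + c,
    y = d*col + e*row + f, with (0, 0) at the top-left corner of the raster.
    """
    a, b, c, d, e, f = transform
    return a * col + b * row + c, d * col + e * row + f


def map_to_lonlat(x, y, src_crs):
    """Reproject map coordinates to WGS84 lon/lat (requires pyproj)."""
    from pyproj import Transformer  # lazy import: only needed for this step

    transformer = Transformer.from_crs(src_crs, "EPSG:4326", always_xy=True)
    return transformer.transform(x, y)


# Made-up example: 10 m pixels, north-up raster, chip at offset (640, 1280).
example_transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 600000.0)
scene_col, scene_row = chip_to_scene(640, 1280, 12.5, 20.0)
map_x, map_y = pixel_to_map(example_transform, scene_col, scene_row)
```

Passing `always_xy=True` keeps the output order as (lon, lat), which avoids the axis-order surprises that some CRS definitions cause.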

Where "live data" comes in — and where it doesn't yet

This is the part we want to be most careful about. Two different things both get called "live," and they're not the same:

Already real and current: our risk-scoring side runs on real, queried-on-demand data for Bar Reef Marine Sanctuary, Sri Lanka. It draws on SAR detections with no matching AIS track from the Global Fishing Watch API, the official WDPA marine-protected-area boundary (via Google Earth Engine), and real port locations from OpenStreetMap. One real detection sits just 0.4 km from the sanctuary boundary. This data is not synthetic.
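
A distance-to-boundary figure like the one above can be computed in several ways. The sketch below is one simple option, not necessarily what our pipeline does: it projects the boundary into a local flat frame centred on the detection (an equirectangular approximation, reasonable over a few kilometres) and takes the shortest distance to any boundary segment. The coordinates in the example are invented placeholders, not the real sanctuary outline.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def _to_local_km(lon, lat, lon0, lat0):
    """Project lon/lat to a flat x/y frame in km centred on (lon0, lat0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_KM
    y = math.radians(lat - lat0) * EARTH_RADIUS_KM
    return x, y


def distance_to_boundary_km(point, ring):
    """Shortest distance from `point` (lon, lat) to a closed ring of (lon, lat)."""
    lon0, lat0 = point
    pts = [_to_local_km(lon, lat, lon0, lat0) for lon, lat in ring]
    best = math.inf
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len2 = dx * dx + dy * dy
        # Parameter of the closest point on the segment to the origin (the detection).
        t = 0.0 if seg_len2 == 0 else max(0.0, min(1.0, -(x1 * dx + y1 * dy) / seg_len2))
        best = min(best, math.hypot(x1 + t * dx, y1 + t * dy))
    return best


# Placeholder square boundary and a detection 0.01 degrees south of its bottom edge.
ring = [(79.70, 8.30), (79.80, 8.30), (79.80, 8.40), (79.70, 8.40), (79.70, 8.30)]
dist_km = distance_to_boundary_km((79.75, 8.29), ring)
print(round(dist_km, 3))  # about 1.112 km
```

For production use, a geodesic library or a projected CRS would be a safer choice than this flat approximation, especially for long boundaries.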

Not yet live, and we're saying so plainly: there is no continuous live SAR feed. Sentinel-1 — the actual satellite source — revisits any given location roughly every 6-12 days, not continuously. So "live vessel detection" from SAR is inherently a periodic, scheduled-batch problem, not a streaming one. Any system (ours included) that implies otherwise is overstating what satellite radar can physically do today.

What's actually next on the training side

  • More data, deliberately: cross-training on xView3's full label set (not just using it for validation) to test whether SAR-domain-specific training improves on HRSID alone
  • Real ground-truth evaluation: matching our 122 detections against xView3's actual validation labels to report real precision/recall/localization error on a held-out real scene — not just training-set mAP
  • Scheduled ingestion: turning the "run inference on the latest pass" step into an automated job that fires whenever a new Sentinel-1 scene over a protected area becomes available — closing the gap between "periodic" and "as fresh as physically possible"
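
As a rough idea of what the ground-truth evaluation step could look like, here is a sketch that matches detections to labeled vessel positions and reports precision, recall, and mean localization error. It assumes point labels and a distance tolerance in pixels, and it uses simple greedy matching by confidence rather than any official scoring code. The function name, tolerance, and sample coordinates are all illustrative.

```python
import math


def match_detections(detections, truths, max_dist_px):
    """Greedily match detections to ground-truth points.

    detections: list of (x, y, confidence) in scene pixel coordinates.
    truths:     list of (x, y) labeled vessel positions.
    Each truth can be matched at most once; higher-confidence detections go first.
    """
    unmatched = set(range(len(truths)))
    errors = []  # localization error (px) for each true positive
    for x, y, _conf in sorted(detections, key=lambda d: d[2], reverse=True):
        best, best_dist = None, max_dist_px
        for i in unmatched:
            dist = math.hypot(x - truths[i][0], y - truths[i][1])
            if dist <= best_dist:
                best, best_dist = i, dist
        if best is not None:
            unmatched.discard(best)
            errors.append(best_dist)
    tp = len(errors)
    return {
        "precision": tp / len(detections) if detections else 0.0,
        "recall": tp / len(truths) if truths else 0.0,
        "mean_error_px": sum(errors) / tp if tp else None,
    }


# Toy example: two detections land near labels, one is a false alarm.
dets = [(10, 10, 0.9), (100, 100, 0.8), (500, 500, 0.7)]
labels = [(12, 10), (103, 104), (900, 900)]
report = match_detections(dets, labels, max_dist_px=20)
```

Greedy matching is easy to reason about but can be suboptimal in crowded scenes; an optimal assignment (for example, the Hungarian algorithm) is the usual upgrade once the basic numbers look sane.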

We'd rather ship a small, honestly-scoped model with a clear roadmap than overclaim a bigger one. More updates as the risk-scoring + agent layer come together.
