OceanGuard AI

About the Project

OceanGuard AI is a satellite-AI platform that helps detect possible dark-vessel risk near protected marine areas.

The system combines Sentinel-1 SAR imagery, a trained YOLO vessel-detection model, AIS evidence comparison, marine protected-area checks, risk scoring, and AI-generated evidence cards to support faster human review.

OceanGuard does not automatically accuse vessels. It is designed as a decision-support system that turns complex satellite and maritime signals into clear, review-ready evidence for analysts, conservation teams, and patrol officers.

Inspiration

We were inspired by a serious ocean-monitoring problem: the ocean is too large to watch manually, and some vessels can become invisible to public tracking systems when AIS is switched off.

At the same time, marine protected areas need faster review workflows because suspicious activity can disappear before humans finish checking the data.

This project connects strongly with SDG 14: Life Below Water, because protecting marine resources requires better visibility, faster triage, and responsible use of technology.

The Challenge

OceanGuard focuses on three main problems:

  • Ships go dark: Vessels may switch off AIS, making them harder to track using public vessel data.
  • Too many contacts to check: Satellite systems can produce many detections, but human teams cannot inspect every contact manually.
  • Protected areas go unwatched: Marine reserves need faster prioritization so risky detections are reviewed before the vessel is gone.

What We Built

We built a working OceanGuard AI prototype with:

  • A YOLO-based vessel detector trained for SAR vessel-like object detection.
  • A Sentinel-1 SAR pipeline for satellite-based vessel detection.
  • An AIS comparison layer to identify possible AIS-off tracking gaps.
  • A marine protected-area risk layer to check whether detections are inside or near protected waters.
  • A risk scoring engine to prioritize detections.
  • A FastAPI backend with agent endpoints for briefing, patrol recommendations, narration, and Q&A.
  • A web dashboard that shows detections, risk levels, evidence cards, and analyst review actions.
  • A GCP deployment setup using Cloud Run, Secret Manager, and GitHub Actions.
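
The AIS comparison layer listed above can be pictured as a nearest-neighbour check in space and time. The sketch below is illustrative only and is not taken from the OceanGuard codebase: the `AisFix` record, the 2 km radius, and the 30-minute window are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


@dataclass(frozen=True)
class AisFix:
    """One AIS position report (hypothetical record layout)."""
    mmsi: str
    lat: float
    lon: float
    time: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def has_ais_match(det_lat, det_lon, det_time, fixes,
                  max_km=2.0, max_dt=timedelta(minutes=30)):
    """True if any AIS fix lies within max_km and max_dt of the detection."""
    for fix in fixes:
        if abs(fix.time - det_time) > max_dt:
            continue  # too far apart in time to be the same contact
        if haversine_km(det_lat, det_lon, fix.lat, fix.lon) <= max_km:
            return True
    return False
```

A detection for which `has_ais_match` returns `False` would be a candidate AIS gap. It is not proof that a vessel switched AIS off, since reception gaps and small vessels without AIS produce the same result.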

A simplified risk score can be represented as:

$$ \text{Risk} = w_1 \cdot \text{SAR Confidence} + w_2 \cdot \text{AIS Gap} + w_3 \cdot \text{MPA Proximity} + w_4 \cdot \text{Activity Context} $$

This helps OceanGuard rank detections so human reviewers can focus on the most important cases first.
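
A minimal sketch of how such a weighted score could be computed and used for ranking. The weight values, the `RiskWeights` name, and the assumption that every signal is normalized to [0, 1] are placeholders for illustration, not the values OceanGuard uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskWeights:
    """Illustrative weights w_1..w_4; assumed values that sum to 1."""
    sar_confidence: float = 0.3
    ais_gap: float = 0.3
    mpa_proximity: float = 0.3
    activity_context: float = 0.1


def risk_score(sar_confidence, ais_gap, mpa_proximity, activity_context,
               weights=RiskWeights()):
    """Weighted sum of four signals, each expected in the range [0, 1]."""
    signals = (sar_confidence, ais_gap, mpa_proximity, activity_context)
    if not all(0.0 <= s <= 1.0 for s in signals):
        raise ValueError("each signal must be normalized to [0, 1]")
    return (weights.sar_confidence * sar_confidence
            + weights.ais_gap * ais_gap
            + weights.mpa_proximity * mpa_proximity
            + weights.activity_context * activity_context)


# Rank two hypothetical detections, highest risk first
detections = {
    "det-a": risk_score(0.76, 1.0, 0.9, 0.5),
    "det-b": risk_score(0.60, 0.0, 0.2, 0.1),
}
ranked = sorted(detections, key=detections.get, reverse=True)
```

Keeping the weights in one small object makes it easy to tune them later without touching the scoring logic.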

How It Works

Sentinel-1 SAR imagery
        ↓
SAR preprocessing
        ↓
YOLO vessel detector
        ↓
Detection coordinates
        ↓
AIS evidence comparison
        ↓
Protected-area check
        ↓
Risk scoring
        ↓
AI evidence writer
        ↓
Evidence card
        ↓
Human analyst review

Updates

Training milestone: our SAR vessel detector is live, and here's the honest story behind it

We just wrapped the core detection pipeline for OceanGuard AI, and we want to be transparent about what "trained" actually means here — both what we proved and what's genuinely next.

What we trained on (and why it's a deliberately small, focused set)

We did not throw a massive dataset at this. We trained YOLO11n on HRSID — 5,604 high-resolution SAR images containing 16,951 labeled ships (2,857 train / 715 validation split). That's a focused, high-quality dataset, not a broad one.

  • Model: YOLO11n
  • Epochs: 50
  • Image size: 640x640
  • Batch: 16
  • Hardware: Tesla T4 GPU
  • Time: 1.69 hours
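
For readers who want to reproduce a run with these settings, the configuration could be expressed with the Ultralytics Python API roughly as follows. This is a sketch under assumptions: the dataset file name `hrsid.yaml` is hypothetical, and it presumes the HRSID annotations have already been converted to YOLO label format.

```python
# Hyperparameters as reported above; the dataset YAML path is hypothetical.
TRAIN_ARGS = {
    "data": "hrsid.yaml",  # assumed: HRSID labels already converted to YOLO format
    "epochs": 50,
    "imgsz": 640,
    "batch": 16,
}


def train(weights="yolo11n.pt", **overrides):
    """Fine-tune YOLO11n with the settings above (needs `pip install ultralytics`)."""
    # Imported lazily so the configuration can be inspected without the package.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Calling `train()` starts a run with the reported settings, and `train(epochs=5)` is a quick smoke test before committing GPU time.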

Results:

Metric       Score
mAP50        0.838
mAP50-95     0.579
Precision    0.830
Recall       0.818

We're being upfront: this is a small, single-source training set. It's enough to prove the concept works — that a model can learn to find ship-shaped bright returns in SAR backscatter — but it is not yet trained on the scale or diversity of imagery a production maritime surveillance system would need.

Proving it isn't just overfit to its own dataset

The result that matters more to us than mAP50 is this: we ran the trained model on a scene it had never seen, a real Sentinel-1 SAR pass from xView3 (28,676 x 24,522 px, far larger than any training image). We tiled it into 1,174 usable 640x640 chips and ran inference.
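
The tiling step can be sketched as a plain non-overlapping grid. This is an assumption about the approach, since the post does not say whether overlap or padding was used. A grid like this over a scene of that size yields 1,755 windows including partial edge tiles. The filter that reduces them to the 1,174 usable chips is not described, so it is left out here.

```python
from math import ceil


def tile_windows(width, height, tile=640):
    """Yield (col_off, row_off, w, h) windows that cover the scene without overlap.

    Edge windows are clipped to the scene, so they may be smaller than `tile`.
    """
    for row_off in range(0, height, tile):
        for col_off in range(0, width, tile):
            yield (col_off, row_off,
                   min(tile, width - col_off),
                   min(tile, height - row_off))


# Scene size quoted above; which number is width and which is height is assumed.
windows = list(tile_windows(28_676, 24_522))
grid_total = ceil(28_676 / 640) * ceil(24_522 / 640)
```

Keeping each window's `col_off` and `row_off` matters later: they are what allow a box found inside a chip to be placed back in full-scene pixel coordinates.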

Result: 122 vessel detections on real imagery, with confidences up to 0.76, fully georeferenced from raw pixels to lat/lon using rasterio + pyproj. That scene happened to be over the Gulf of Guinea, not our target region, and we are stating that plainly. It serves specifically as a generalization check, separate from our actual conservation use case.
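
The pixel-to-coordinate step can be sketched as two small transforms: chip pixels to scene pixels, then scene pixels to map coordinates through the raster's affine transform. The function names and the example transform are hypothetical. The affine math is written out by hand so the example runs without a raster file, and pyproj is only needed when the scene is in a projected CRS.

```python
def chip_to_scene(col_off, row_off, box_cx, box_cy):
    """Convert a box centre in chip pixels to full-scene pixel coordinates."""
    return col_off + box_cx, row_off + box_cy


def pixel_to_map(transform, col, row):
    """Map coordinates for continuous pixel coordinates (col, row).

    `transform` is (a, b, c, d, e, f) with x = a*col + b*row + c and
    y = d*col + e*row + f, the coefficient order of rasterio's dataset.transform.
    Coordinates are measured from the top-left corner of the scene.
    """
    a, b, c, d, e, f = transform
    return a * col + b * row + c, d * col + e * row + f


def map_to_lonlat(x, y, src_crs):
    """Reproject map coordinates to WGS84 lon/lat (needs `pip install pyproj`)."""
    from pyproj import Transformer  # lazy import: only used for projected scenes

    transformer = Transformer.from_crs(src_crs, "EPSG:4326", always_xy=True)
    return transformer.transform(x, y)


# Hypothetical north-up scene: 10 m pixels, top-left corner at (500000, 600000)
example_transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 600000.0)
scene_col, scene_row = chip_to_scene(1280, 640, 320.0, 100.0)
x, y = pixel_to_map(example_transform, scene_col, scene_row)
```

With a real file, the six coefficients would come from the opened dataset's `transform` and `src_crs` from its `crs`.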

Where "live data" comes in — and where it doesn't yet

This is the part we want to be most careful about. Two different things both get called "live," and they're not the same:

Already real and current: Our risk-scoring side runs on real, queried-on-demand data for Bar Reef Marine Sanctuary, Sri Lanka. It draws on SAR detections with no matching AIS record from the Global Fishing Watch API, the official WDPA marine-protected-area boundary (via Google Earth Engine), and real port locations from OpenStreetMap. One real detection sits just 0.4 km from the sanctuary boundary. This data is not synthetic.
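
A distance-to-boundary figure like the one above can be computed from the boundary polygon's vertices. The sketch below is a simplified stand-in, not the project's code: it uses a flat local projection, which is only reasonable over short distances, and the function names are hypothetical. Any polygon passed to it in an example is made up and is not the real sanctuary boundary.

```python
from math import cos, hypot, radians

KM_PER_DEG_LAT = 111.32  # approximate; adequate for short distances


def _to_local_km(lat, lon, ref_lat, ref_lon):
    """Project lat/lon to flat x/y kilometres around a reference point."""
    x = (lon - ref_lon) * KM_PER_DEG_LAT * cos(radians(ref_lat))
    y = (lat - ref_lat) * KM_PER_DEG_LAT
    return x, y


def _point_segment_km(px, py, ax, ay, bx, by):
    """Distance from point P to segment AB in the local plane."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def inside_polygon(lat, lon, ring):
    """Ray-casting point-in-polygon test; `ring` is a list of (lat, lon) vertices."""
    inside = False
    for (lat1, lon1), (lat2, lon2) in zip(ring, ring[1:] + ring[:1]):
        if (lat1 > lat) != (lat2 > lat):
            lon_cross = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < lon_cross:
                inside = not inside
    return inside


def distance_to_boundary_km(lat, lon, ring):
    """Shortest distance from a point to the polygon boundary, in kilometres."""
    pts = [_to_local_km(v_lat, v_lon, lat, lon) for v_lat, v_lon in ring]
    return min(_point_segment_km(0.0, 0.0, ax, ay, bx, by)
               for (ax, ay), (bx, by) in zip(pts, pts[1:] + pts[:1]))
```

Combining the two functions gives the "inside or near" signal: a detection is inside when `inside_polygon` is true, and otherwise its proximity is the boundary distance.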

Not yet live, and we're saying so plainly: there is no continuous live SAR feed. Sentinel-1, the actual satellite source, revisits any given location roughly every 6-12 days, not continuously. So "live vessel detection" from Sentinel-1 SAR is inherently a periodic, scheduled-batch problem, not a streaming one. Any system (ours included) that implies otherwise is overstating what this satellite source can deliver today.
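
Framed as a batch problem, the core of each run is deciding which newly catalogued scenes are worth processing. A minimal sketch, assuming a hypothetical `Scene` record and bounding-box footprints (real catalogues return richer footprints and metadata):

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Scene:
    """Minimal catalogue record for one SAR acquisition (hypothetical layout)."""
    scene_id: str
    acquired: datetime
    bbox: tuple  # (min_lon, min_lat, max_lon, max_lat)


def bboxes_intersect(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def scenes_to_process(catalogue, area_bbox, last_processed):
    """Scenes over the area acquired after the last processed pass, oldest first."""
    fresh = [s for s in catalogue
             if s.acquired > last_processed and bboxes_intersect(s.bbox, area_bbox)]
    return sorted(fresh, key=lambda s: s.acquired)
```

A scheduler would call `scenes_to_process` on each run, process the returned scenes in order, and store the newest acquisition time as the next `last_processed`.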

What's actually next on the training side

  • More data, deliberately: cross-training on xView3's full label set (not just using it for validation) to test whether training on Sentinel-1 imagery improves on HRSID alone
  • Real ground-truth evaluation: matching our 122 detections against xView3's actual validation labels to report real precision/recall/localization error on a held-out real scene — not just training-set mAP
  • Scheduled ingestion: turning the "run inference on the latest pass" step into an automated job that fires whenever a new Sentinel-1 scene over a protected area becomes available — closing the gap between "periodic" and "as fresh as physically possible"
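
The ground-truth evaluation step above comes down to matching predicted points to labeled points and counting hits and misses. A minimal sketch using greedy nearest-neighbour matching follows. The matching rule and the distance threshold are assumptions for illustration and are not the official xView3 scoring procedure.

```python
from math import hypot


def match_detections(detections, truths, max_dist):
    """Greedily match detections to ground-truth points within max_dist.

    `detections` is a list of (x, y, confidence); `truths` is a list of (x, y).
    Higher-confidence detections are matched first, and each truth is used once.
    Returns (true_positives, false_positives, false_negatives, mean_error).
    """
    unmatched = list(truths)
    errors = []
    for x, y, _conf in sorted(detections, key=lambda d: d[2], reverse=True):
        if not unmatched:
            break  # remaining detections are counted as false positives below
        nearest = min(unmatched, key=lambda t: hypot(t[0] - x, t[1] - y))
        dist = hypot(nearest[0] - x, nearest[1] - y)
        if dist <= max_dist:
            unmatched.remove(nearest)
            errors.append(dist)
    tp = len(errors)
    fp = len(detections) - tp
    fn = len(unmatched)
    mean_error = sum(errors) / tp if tp else None
    return tp, fp, fn, mean_error


def precision_recall(tp, fp, fn):
    """Precision and recall from match counts; 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Coordinates should be in a metric system (pixels or projected metres) so that `max_dist` and the mean localization error are meaningful.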

We'd rather ship a small, honestly-scoped model with a clear roadmap than overclaim a bigger one. More updates as the risk-scoring and agent layers come together.
