Inspiration
The California Current is one of the most productive and most stressed marine ecosystems on Earth. Scripps Institution of Oceanography has been measuring its chemistry hourly since 2010 through the CCE mooring network: pH, dissolved oxygen, aragonite saturation, nitrate, chlorophyll. Meanwhile, CalCOFI has counted fish larvae from research vessels since the 1950s, and thousands of iNaturalist citizen scientists log anchovy, sardine, and blue whale sightings in near real time.
Nobody had connected all three.
The gap that drove us: ocean acidification is measured hourly, but the decisions that depend on it - research priorities, habitat assessments, conservation responses - often wait months for the next survey cruise. We wanted to close that gap.
What it does
MooringMind is a multi-page Streamlit dashboard that turns 15 years of raw mooring CSV data into actionable habitat intelligence:
Mission Control - Map pins for CCE reference sites, six headline sensors compared against their 30-day trailing mean, aragonite saturation readout, and a multi-sensor co-movement chart that rescales pH, oxygen, nitrate, salinity, temperature, and chlorophyll onto one axis so you see when stressors move together versus when they split.
Analytics - Physical (T-S diagrams, salinity timeseries), biological (chlorophyll and nitrate), acidification (pH total scale, 15-year trend), and extremes (seasonal radar showing which variables peak in which season).
AI Predictions - Three ML tools in one tab: a short-term sensor forecast; a multivariate anomaly detector that scores every timestamp across sensors simultaneously and flags the most unusual 2% of moments; and a sensor reconstruction tool that estimates what a failed sensor would have read using its neighbors - trained on the older 80% of the record, validated on data the model never saw.
Analysis Lab - STL seasonal decomposition, rolling statistics, and anomaly flags on daily residuals for researchers who want to go deeper than the dashboard summaries.
Data Quality - A coverage heatmap showing exactly which sensors have gaps, when, and how much - so you know which months to trust before running analysis.
Species Validation - The ecological bridge: mooring anomaly dates cross-referenced against iNaturalist research-grade observations. After a dissolved oxygen breach event, you can see whether anchovy, sardine, Humboldt squid, or blue whale sightings shifted in the following two weeks - a first check on whether the physics actually reached the biology.
CalCOFI Ecosystem - Decades of ship-based larval fish and zooplankton surveys overlaid with the mooring record, so the high-resolution sensor data and the long historical biological record can be read together.
How we built it
Stack: Python · Streamlit · Plotly · scikit-learn · statsmodels · pandas · pydeck · iNaturalist API
The core data pipeline processes the CCE mooring master CSV into a daily panel, interpolates for rolling statistics, and runs STL decomposition on each sensor column. The anomaly detector uses Isolation Forest across four sensors simultaneously, flagging moments where the combination is unusual, not just individual outliers. The sensor reconstruction is a Random Forest trained on lagged cross-sensor features; the feature importance chart shows which neighboring sensors contributed most to the prediction.
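A minimal sketch of the multivariate anomaly step, assuming scikit-learn's `IsolationForest` with `contamination=0.02` to match the 2% flag rate mentioned earlier. The function name, the four column names, and the synthetic data are hypothetical; the source does not say which four sensors the detector uses.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest


def flag_joint_anomalies(panel, columns, contamination=0.02, seed=0):
    """Score each complete timestamp across several sensors at once."""
    # Isolation Forest cannot take NaN, so keep only rows where every sensor reported.
    complete = panel[list(columns)].dropna()
    X = complete.to_numpy()
    model = IsolationForest(n_estimators=200, contamination=contamination, random_state=seed)
    labels = model.fit_predict(X)  # -1 = anomaly, 1 = normal
    scored = complete.copy()
    scored["anomaly_score"] = model.score_samples(X)  # lower = more unusual
    scored["is_anomaly"] = labels == -1
    return scored


# Synthetic demo: three sensors share a common driver, the fourth is independent.
rng = np.random.default_rng(1)
n = 2000
shared = rng.normal(size=n)
demo = pd.DataFrame(
    {
        "ph": 8.0 + 0.05 * shared + rng.normal(0, 0.01, n),
        "oxygen": 220 + 30 * shared + rng.normal(0, 5, n),
        "nitrate": 5 - 2 * shared + rng.normal(0, 0.5, n),
        "chlorophyll": 1.0 + rng.normal(0, 0.2, n),
    },
    index=pd.date_range("2022-01-01", periods=n, freq="h"),
)
scored = flag_joint_anomalies(demo, ["ph", "oxygen", "nitrate", "chlorophyll"])
print(f"{scored['is_anomaly'].mean():.1%} of timestamps flagged")
```

Because `contamination` fixes the share of flagged rows in advance, the count of anomalies reflects that setting rather than how unusual the period was; the ranking by `anomaly_score` carries the information.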
Dynamic insight generation runs on every page load. Functions like insight_headline_metrics(), insight_mooring_window(), and key_findings_mission() auto-write the captions from the data so MooringMind stays current as new CSV data comes in. The aragonite habitat readout is derived from overlapping pH, temperature, and salinity timestamps via PyCO2SYS logic.
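To make the idea of data-driven captions concrete, here is a small sketch of one such caption: the latest reading compared with its 30-day trailing mean. The helper `describe_vs_trailing_mean` is hypothetical; the source names `insight_headline_metrics()` and its siblings but does not show their signatures or wording.

```python
import pandas as pd


def describe_vs_trailing_mean(daily: pd.Series, label: str, unit: str = "", window_days: int = 30) -> str:
    """Write a one-sentence caption from the data (hypothetical helper)."""
    clean = daily.dropna()
    if clean.empty:
        return f"{label}: no readings in the selected window."

    latest_time, latest = clean.index[-1], clean.iloc[-1]
    start = latest_time - pd.Timedelta(days=window_days)
    # Trailing window: the readings before the latest one, up to window_days back.
    window = clean[(clean.index >= start) & (clean.index < latest_time)]
    if window.empty:
        return f"{label} is {latest:.2f}{unit}; no earlier readings to compare against."

    baseline = window.mean()
    delta = latest - baseline
    if round(delta, 2) == 0:
        return f"{label} is {latest:.2f}{unit}, in line with its {window_days}-day trailing mean."
    direction = "above" if delta > 0 else "below"
    return (
        f"{label} is {latest:.2f}{unit}, {abs(delta):.2f}{unit} {direction} "
        f"its {window_days}-day trailing mean ({baseline:.2f}{unit})."
    )


# Demo: thirty steady days followed by a drop.
idx = pd.date_range("2024-01-01", periods=31, freq="D")
caption = describe_vs_trailing_mean(pd.Series([8.00] * 30 + [7.90], index=idx), "pH")
print(caption)
```

Returning a sentence for the empty and single-reading cases, instead of raising, lets a page render a caption even when a freshly uploaded CSV is sparse.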
The app is modular by design: swap in any CCE mooring slice via sidebar upload, toggle between CCE1 and CCE2, and adjust the time window globally across all pages at once.
Challenges we ran into
Data sparsity is uneven. Temperature and salinity sensors dropped offline in 2014, leaving a 29% fill rate. Building analysis tools that degrade gracefully when columns are missing, rather than crashing, required defensive logic throughout. The coverage heatmap exists specifically because we kept forgetting which months were trustworthy.
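The two ideas in this paragraph, a per-month coverage table and analysis code that skips unusable columns, can be sketched with pandas alone. The function names, the 50% fill threshold, and the demo data are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def monthly_coverage(panel: pd.DataFrame) -> pd.DataFrame:
    """Share of non-missing readings per sensor and calendar month."""
    months = panel.index.to_period("M")
    return panel.notna().groupby(months).mean()


def usable_columns(panel: pd.DataFrame, wanted, min_fill: float = 0.5):
    """Columns that exist and are filled well enough to analyze; the rest are skipped."""
    return [c for c in wanted if c in panel.columns and panel[c].notna().mean() >= min_fill]


# Demo: temperature drops offline after January; salinity was never uploaded.
idx = pd.date_range("2014-01-01", "2014-03-31 23:00", freq="h")
demo = pd.DataFrame({"ph": 8.0, "temperature": 15.0}, index=idx)
demo.loc[demo.index >= "2014-02-01", "temperature"] = np.nan

coverage = monthly_coverage(demo)  # rows: months, columns: sensors, values in [0, 1]
keep = usable_columns(demo, ["ph", "temperature", "salinity"])
print(coverage)
print("Analyzing:", keep)
```

The coverage table is already in the shape a heatmap expects (months by sensors), and `usable_columns` gives downstream code a list it can iterate over without checking for missing columns at every step.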
Aligning three datasets with different time resolutions. Mooring data is hourly. CalCOFI cruises are quarterly. iNaturalist sightings are event-driven. The inner join for the species validation overlay - finding months where both mooring pH and CalCOFI larvae data exist - reduced a 60-year record to 68 overlapping months. That's the honest number, and we show it.
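The month-level inner join described here can be sketched as follows. The column names `cruise_date` and `larvae_count`, the choice of mean pH and summed larvae as monthly aggregates, and the function name are assumptions, not details from the project.

```python
import numpy as np
import pandas as pd


def monthly_overlap(mooring_ph: pd.Series, cruises: pd.DataFrame,
                    date_col: str = "cruise_date", value_col: str = "larvae_count") -> pd.DataFrame:
    """Keep only calendar months that have both mooring pH and cruise larvae data."""
    ph = mooring_ph.dropna()
    ph_monthly = ph.groupby(ph.index.to_period("M")).mean().rename("ph_mean")

    valid = cruises.dropna(subset=[value_col])
    larvae_monthly = (
        valid.groupby(valid[date_col].dt.to_period("M"))[value_col].sum().rename("larvae_total")
    )
    # Inner join on the month index: months missing from either side are dropped.
    return ph_monthly.to_frame().join(larvae_monthly, how="inner")


# Demo: hourly pH with a February gap, and three cruises (one outside the pH record).
hours = pd.date_range("2015-01-01", "2015-03-31 23:00", freq="h")
ph = pd.Series(8.0, index=hours)
ph[hours.month == 2] = np.nan
cruises = pd.DataFrame(
    {
        "cruise_date": pd.to_datetime(["2015-01-20", "2015-02-10", "2015-07-05"]),
        "larvae_count": [120, 80, 200],
    }
)
overlap = monthly_overlap(ph, cruises)
print(overlap)
print(f"{len(overlap)} overlapping month(s)")
```

Reporting `len(overlap)` next to any chart built from the join is what keeps the reduced sample size visible to the reader.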
Making AI outputs legible to non-ML users. "Isolation Forest flagged 103 anomalies" means nothing without context. We built a plain-language explanation layer - "pH was extremely lower than usual in 46 of the 103 flagged moments - acidic water often comes from respiration, decomposition, or upwelled $\text{CO}_2$-rich deep water" - so the output is interpretable without a machine learning background.
Aragonite saturation estimation. Full $\Omega_\text{aragonite}$ requires co-located pH, temperature, salinity, and alkalinity on the same timestamps. The CCE2 record has gaps in all four. MooringMind honestly reports when the estimate cannot be made rather than showing a misleading number.
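One way to implement the "report when it cannot be computed" behavior is a gate that runs before any carbonate calculation. This sketch only checks for co-located inputs and does not call PyCO2SYS; the column names, the 24-row minimum, and the function name are assumptions.

```python
import numpy as np
import pandas as pd

REQUIRED = ("ph", "temperature", "salinity", "alkalinity")  # assumed column names


def colocated_carbonate_inputs(panel: pd.DataFrame, required=REQUIRED, min_rows: int = 24):
    """Return (rows with all inputs present, message), or (None, reason) if too few exist."""
    missing = [c for c in required if c not in panel.columns]
    if missing:
        return None, f"Cannot estimate aragonite saturation: missing column(s) {', '.join(missing)}."

    # Only timestamps where every required input was recorded count as co-located.
    complete = panel.loc[:, list(required)].dropna()
    if len(complete) < min_rows:
        return None, (
            f"Cannot estimate aragonite saturation: only {len(complete)} of {len(panel)} "
            f"timestamps have all inputs (need at least {min_rows})."
        )
    return complete, f"{len(complete)} of {len(panel)} timestamps have all inputs."


# Demo: alkalinity was never recorded in this window, so no estimate is offered.
idx = pd.date_range("2021-06-01", periods=48, freq="h")
demo = pd.DataFrame(
    {"ph": 8.0, "temperature": 14.0, "salinity": 33.5, "alkalinity": np.nan}, index=idx
)
inputs, message = colocated_carbonate_inputs(demo)
print(message)
```

Returning the reason as text lets the page show the shortfall in place of a number, so a gap in the inputs is never rendered as a saturation value.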
Accomplishments that we're proud of
The sensor reconstruction explained 98% of the variance and outperformed a naive baseline by 95% on data the model had never seen. That's not a toy result; it means the sensors are correlated strongly enough that losing one doesn't mean losing the information.
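For readers who want to see how such a figure could be produced, the sketch below trains a Random Forest on lagged neighbor readings with a chronological 80/20 split and compares it with a naive baseline. The baseline used here (predicting the training mean), the error metric (MAE), the lag choices, and the synthetic data are all assumptions; the source does not specify how its baseline comparison was computed.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score


def reconstruct_sensor(panel, target, neighbors, lags=(0, 1, 2), train_frac=0.8, seed=0):
    """Predict one sensor from lagged neighbors; evaluate on the newest slice only."""
    features = pd.DataFrame(
        {f"{col}_lag{lag}": panel[col].shift(lag) for col in neighbors for lag in lags},
        index=panel.index,
    )
    data = features.join(panel[target]).dropna()
    cols = list(features.columns)

    # Chronological split: no shuffling, so the test rows are strictly later in time.
    split = int(len(data) * train_frac)
    train, test = data.iloc[:split], data.iloc[split:]

    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(train[cols].to_numpy(), train[target].to_numpy())
    pred = model.predict(test[cols].to_numpy())

    # Naive baseline (an assumption): always predict the training-period mean.
    baseline = np.full(len(test), train[target].mean())
    mae_model = mean_absolute_error(test[target], pred)
    mae_base = mean_absolute_error(test[target], baseline)
    return {
        "r2": r2_score(test[target], pred),
        "improvement_pct": 100.0 * (1.0 - mae_model / mae_base),
        "importances": pd.Series(model.feature_importances_, index=cols).sort_values(ascending=False),
    }


# Synthetic demo: pH tracks oxygen closely; temperature is unrelated noise.
rng = np.random.default_rng(2)
n = 1500
oxygen = 220 + 40 * np.sin(2 * np.pi * np.arange(n) / 200) + rng.normal(0, 3, n)
demo = pd.DataFrame(
    {
        "oxygen": oxygen,
        "temperature": 15 + rng.normal(0, 1, n),
        "ph": 7.6 + 0.002 * oxygen + rng.normal(0, 0.005, n),
    },
    index=pd.date_range("2023-01-01", periods=n, freq="h"),
)
report = reconstruct_sensor(demo, target="ph", neighbors=["oxygen", "temperature"])
print(f"R^2 = {report['r2']:.2f}, MAE improvement = {report['improvement_pct']:.0f}%")
print(report["importances"].head(3))
```

A tree ensemble cannot extrapolate beyond the range it was trained on, so a held-out period that drifts outside the training range would score worse than this stationary demo suggests.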
The species validation page produces a testable hypothesis: mooring anomaly on date X, species sightings shift in the following 14 days. Three of the 2022 dissolved oxygen breach events showed 20-140% increases in nearby sightings afterward. That's a signal worth following.
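The before/after comparison behind those percentages can be expressed in a few lines. This is a sketch under assumptions: sightings arrive as a Series of observation timestamps, the "before" window has the same 14-day length as the "after" window, and the function name is hypothetical.

```python
import pandas as pd


def sighting_shift(sightings: pd.Series, event_date, window_days: int = 14) -> dict:
    """Percent change in sighting counts after an event versus the same span before it."""
    event = pd.Timestamp(event_date)
    window = pd.Timedelta(days=window_days)
    before = int(((sightings >= event - window) & (sightings < event)).sum())
    after = int(((sightings >= event) & (sightings < event + window)).sum())
    # With no sightings beforehand, a percent change is undefined, so report None.
    pct = None if before == 0 else float(100.0 * (after - before) / before)
    return {"before": before, "after": after, "pct_change": pct}


# Demo: 5 sightings in the two weeks before the event, 12 in the two weeks after.
sightings = pd.Series(
    pd.to_datetime(["2022-08-01"] * 5 + ["2022-08-15"] * 12 + ["2022-09-30"] * 3)
)
shift = sighting_shift(sightings, "2022-08-10")
print(shift)
```

Raw counts like these do not correct for how many observers were active in each window, so a change in observer effort can look like a change in the animals.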
The auto-generated insight layer means MooringMind narrates itself - every page writes its own captions from the live data, so it's useful with a fresh CSV without any manual annotation.
What we learned
Carbonate chemistry is harder to compute honestly than it looks. The number of co-located, quality-controlled timestamps for full $\Omega_\text{aragonite}$ estimation is much smaller than the raw record length suggests - and showing that honestly is more valuable than papering over it.
Citizen science data (iNaturalist) is noisier than sensor data but faster. The mooring measures the stress; iNaturalist measures the response. Together they tell a more complete story than either does alone.
The most important design decision we made was the multi-sensor co-movement chart on Mission Control - normalizing everything to $[0, 1]$ so you compare timing, not chemistry. That single view surfaces pH-oxygen coupling ($r = 0.92$ on overlapping points) instantly, with no statistics background required.
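The rescaling is ordinary min-max normalization, $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$, applied per sensor, and the coupling figure is a Pearson correlation restricted to timestamps where both sensors reported. A short sketch, with hypothetical function names and toy data:

```python
import numpy as np
import pandas as pd


def minmax_rescale(panel: pd.DataFrame) -> pd.DataFrame:
    """Rescale every column to [0, 1] so sensors with different units share one axis."""
    lo, hi = panel.min(), panel.max()
    # A constant column has zero range; mark it NaN rather than divide by zero.
    span = (hi - lo).where(hi > lo)
    return (panel - lo) / span


def overlap_correlation(a: pd.Series, b: pd.Series):
    """Pearson r using only timestamps where both series have a value."""
    pair = pd.concat([a, b], axis=1).dropna()
    return pair.iloc[:, 0].corr(pair.iloc[:, 1]), len(pair)


# Toy data: oxygen has one more reading than pH.
idx = pd.date_range("2022-05-01", periods=4, freq="D")
demo = pd.DataFrame(
    {"ph": [7.8, 7.9, 8.0, np.nan], "oxygen": [100.0, 150.0, 200.0, 250.0]}, index=idx
)
scaled = minmax_rescale(demo)
r, n_overlap = overlap_correlation(demo["ph"], demo["oxygen"])
print(scaled)
print(f"r = {r:.2f} on {n_overlap} overlapping points")
```

Min-max scaling is sensitive to single extreme readings, since one spike compresses the rest of the series toward zero; quoting the overlap count alongside $r$ also shows how much data the correlation rests on.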
What's next for MooringMind - Ocean Mooring Insights, Instantly
Live data ingestion. Connect directly to the Scripps/MBARI data feeds so Mission Control updates automatically rather than requiring a CSV upload.
Full carbonate system solver. Integrate PyCO2SYS completely - when alkalinity estimates are available, compute $\Omega_\text{aragonite}$, $p\text{CO}_2$, and DIC alongside the existing pH record.
Expanded species layer. Add OBIS and GBIF occurrence data to the species validation page so the biological cross-reference isn't limited to iNaturalist observer density.
Alert subscriptions. Email or Slack notifications when a mooring crosses a user-defined pH or oxygen threshold - so researchers don't have to check the dashboard to know something changed.
Multi-mooring comparison. MooringMind currently supports CCE1 and CCE2. Extending to MBARI M1/M2 and the broader OOI array would make it a California Current-wide monitoring tool rather than a single-site dashboard.
Publish the pipeline as a Python package so other mooring programs globally can run the same anomaly detection and species validation stack on their own data.
Built With
- data-analytics
- inaturalistapi
- machine-learning
- pandas
- plotly
- pydeck
- python
- scikit-learn
- statsmodels
- streamlit