Inspiration

Water pollution from industrial discharge is one of the most preventable forms of environmental damage — yet most facilities only discover a compliance breach after it happens, when the fine has already been issued and the damage is done. We wanted to flip that: give operators a system that sees a breach coming 30 minutes before it occurs and tells them exactly what to do about it.

We also noticed that drought and groundwater stress are increasingly relevant to water treatment facilities: a plant stressed by low groundwater operates very differently to one with abundant supply. We found no existing tool that connects real-time effluent monitoring with satellite-derived environmental context. AquaSense AI does both.

What it does

AquaSense AI is a full-stack real-time compliance intelligence platform for wastewater treatment at food processing facilities. It:

  • Streams live sensor data (COD, BOD, TSS, Ammonia, pH, Temperature, Turbidity, DO, Conductivity, ORP) via Server-Sent Events, replayed from an IoT dataset at demo speed
  • Predicts compliance breaches 30 minutes in advance using a trained ML model, with a probability score and the top contributing parameters
  • Fires automated bulletins the moment an alert is created — dashboard toast notifications, SMS, and WhatsApp messages via Twilio
  • Hosts a knowledge-based AI chat assistant (Qwen 2.5-7B via HuggingFace) that knows the live system state and answers compliance questions in plain language
  • Maps regional drought severity and groundwater stress for 24 UK cities using ERA5 reanalysis data (a satellite-informed climate reanalysis) served by Open-Meteo, with an interactive Leaflet map and per-city trend charts
  • Generates downloadable PDF compliance reports with full sensor history, anomaly flags, and recommended actions

How we built it

The system has three services running in parallel:

ML Service — A FastAPI app running a scikit-learn pipeline trained on historical wastewater sensor data. It produces breach probability scores, 30-minute parameter forecasts, anomaly detection flags, and a top-driver breakdown for every reading.

Backend — An Express/Node.js server that replays sensor CSV data on a 2-second tick (each row represents a 5-minute IoT interval), calls the ML service, stores predictions and alerts in SQLite, and streams everything to connected clients via SSE. Twilio SMS and WhatsApp notifications fire on a per-severity cooldown to prevent flooding.
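
To make the replay timing and the SSE wire format concrete, here is a minimal sketch. Only the 2-second tick, the 5-minute row interval, and the 30-minute horizon come from the description above; the constant names, the formatSseEvent helper, and the "prediction" event name are hypothetical and not taken from the project's code.

```typescript
// Values from the write-up: one CSV row (5 min of plant time) is replayed every 2 s.
const TICK_MS = 2_000;
const ROW_INTERVAL_MIN = 5;
const HORIZON_MIN = 30;

// One 30-minute forecast horizon spans this many replayed rows...
const rowsPerHorizon = HORIZON_MIN / ROW_INTERVAL_MIN;
// ...which pass in this many seconds of demo wall-clock time.
const demoSecondsPerHorizon = (rowsPerHorizon * TICK_MS) / 1000;
// Ratio of plant time to demo time.
const speedUp = (ROW_INTERVAL_MIN * 60 * 1000) / TICK_MS;

// Hypothetical helper: serialise one payload as a Server-Sent Events frame.
// A frame is an optional "event:" line, a "data:" line, and a blank line.
function formatSseEvent(eventName: string, payload: Record<string, unknown>): string {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}
```

Under these numbers the demo runs 150 times faster than the plant, so a 30-minute warning arrives about 12 seconds ahead of the replayed breach.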

Frontend — A Next.js 15 app with a live dashboard, risk analysis, alert management, data quality monitoring, PDF reports, an AI chat panel, and the drought/satellite page. Charts are built with Recharts, the regional map with react-leaflet, and the AI responses render as formatted markdown via react-markdown. Fully theme-aware (dark/light) including the Leaflet tile layer.

Challenges we faced

  • FastAPI/Starlette version mismatch — Starlette 1.0.0 changed middleware tuple unpacking from 2-tuple to 3-tuple, silently breaking our middleware stack. Took significant debugging across two Python environments (Homebrew vs Anaconda) before identifying the root cause and upgrading FastAPI.
  • Notification flooding — Early Twilio integration fired an SMS on every prediction tick, burning through the daily message cap in minutes. We implemented per-severity cooldowns (RED: 30s, AMBER: 60s, WATCH: 120s) and a midnight-reset daily counter.
  • Compliance limit tuning — Setting limits too conservatively put normal readings permanently into AMBER, making alerts meaningless. We had to calibrate against the actual data distribution to achieve a realistic breach rate.
  • Leaflet SSR — Next.js's server-side rendering breaks Leaflet's DOM assumptions. We solved this with dynamic() import with ssr: false and a MutationObserver-based theme hook to switch map tiles between dark and light CartoDB layers without remounting.
  • AI chat context — The HuggingFace model had no knowledge of our live system. We built a buildSystemPrompt() function that injects the latest prediction, active alerts, status distribution, and compliance limits from SQLite into every request, making responses genuinely contextual.
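
The notification-flooding fix above can be sketched roughly as follows. The cooldown values are the ones listed in the bullet; the NotificationGate class, its method names, and the daily cap value are assumptions for illustration, not the project's actual code.

```typescript
type Severity = "RED" | "AMBER" | "WATCH";

// Cooldowns from the write-up: RED 30 s, AMBER 60 s, WATCH 120 s.
const COOLDOWN_MS: Record<Severity, number> = { RED: 30_000, AMBER: 60_000, WATCH: 120_000 };

// Hypothetical gate that decides whether a message may be sent.
class NotificationGate {
  private lastSent = new Map<Severity, number>();
  private sentToday = 0;
  private dayKey = "";
  private dailyCap: number;

  constructor(dailyCap: number) {
    this.dailyCap = dailyCap; // assumed cap; the real limit depends on the Twilio account
  }

  shouldSend(severity: Severity, now: Date): boolean {
    // Reset the counter when the local calendar day changes (the "midnight reset").
    const today = now.toDateString();
    if (today !== this.dayKey) {
      this.dayKey = today;
      this.sentToday = 0;
    }
    if (this.sentToday >= this.dailyCap) return false;

    // Suppress repeats of the same severity inside its cooldown window.
    const last = this.lastSent.get(severity);
    if (last !== undefined && now.getTime() - last < COOLDOWN_MS[severity]) return false;

    this.lastSent.set(severity, now.getTime());
    this.sentToday += 1;
    return true;
  }
}
```

Keying the cooldown by severity means an escalation from AMBER to RED still gets through immediately, while repeated RED ticks stay quiet for 30 seconds.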
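
For the AI chat context, a function in the spirit of buildSystemPrompt() might look like the sketch below. The LiveState shape, the field names, and the prompt wording are all hypothetical; in the real system the values are read from SQLite, which is left out here. Limits are shown as simple upper bounds for brevity.

```typescript
// Hypothetical snapshot of the live state; the real schema lives in the project's database.
interface LiveState {
  latestPrediction: { status: string; breachProbability: number; topDrivers: string[] } | null;
  activeAlerts: { severity: string; message: string }[];
  statusCounts: Record<string, number>; // readings per status
  limits: Record<string, number>;       // parameter -> upper limit
}

function buildSystemPrompt(state: LiveState): string {
  const p = state.latestPrediction;
  const prediction = p
    ? `${p.status}, breach probability ${Math.round(p.breachProbability * 100)}%, top drivers: ${p.topDrivers.join(", ")}`
    : "none yet";
  const alerts = state.activeAlerts.length
    ? state.activeAlerts.map((a) => `[${a.severity}] ${a.message}`).join("; ")
    : "none";
  const counts = Object.entries(state.statusCounts).map(([k, v]) => `${k}=${v}`).join(", ");
  const limits = Object.entries(state.limits).map(([k, v]) => `${k}<=${v}`).join(", ");

  // One labelled line per fact so the model can quote the live values directly.
  return [
    "You are a wastewater compliance assistant. Answer using the live state below.",
    `Latest prediction: ${prediction}`,
    `Active alerts: ${alerts}`,
    `Status distribution: ${counts}`,
    `Compliance limits: ${limits}`,
  ].join("\n");
}
```

Rebuilding the prompt on every request keeps the assistant's answers tied to the current readings rather than to whatever the state was when the chat started.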

Behind the dashboard, we built the model in layers, because real wastewater monitoring cannot depend on one black-box prediction.

First, rules check whether the site is within its consent limits right now. Then soft sensors estimate COD, BOD, and TSS from faster signals like turbidity, UV254, flow, conductivity, and dissolved oxygen, an approach supported by wastewater research on sensor fusion and soft-sensor methods.
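
As a rough illustration of the rules layer, the sketch below compares one reading against per-parameter bounds. The limit values are placeholders chosen for the example, not the site's consent limits, and the names checkLimits and EXAMPLE_LIMITS are hypothetical.

```typescript
type Reading = Record<string, number>;

interface Limit {
  min?: number;
  max?: number;
}

// Placeholder limits for illustration only; real consent limits are site-specific.
const EXAMPLE_LIMITS: Record<string, Limit> = {
  COD: { max: 125 },
  TSS: { max: 35 },
  pH: { min: 6, max: 9 },
};

// Returns the parameters that are outside their limits in this reading.
function checkLimits(reading: Reading, limits: Record<string, Limit>): string[] {
  const breached: string[] = [];
  for (const [param, limit] of Object.entries(limits)) {
    const value: number | undefined = reading[param];
    if (value === undefined) continue; // a missing value is a data-quality issue, not a breach
    const tooLow = limit.min !== undefined && value < limit.min;
    const tooHigh = limit.max !== undefined && value > limit.max;
    if (tooLow || tooHigh) breached.push(param);
  }
  return breached;
}
```

A deterministic check like this gives operators an answer that does not depend on any model output, which is the point of putting rules at the bottom of the stack.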

On top of that, our time-series model predicts 30-minute breach risk, and separate forecasters predict where COD, TSS, BOD, ammonia, and pH are heading next.

Finally, an anomaly layer checks for sensor issues like spikes, flatlines, or missing readings. So the dashboard is not just showing data; it is combining rules, research-backed estimates, forecasts, risk prediction, and sensor-quality checks into one live decision-support tool.
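
The sensor-quality checks can be sketched with simple statistics. The version below is an assumption about how such a layer could work, not the project's detector: it flags a missing latest value, a run of identical readings, and a latest value that sits far from the recent median (measured in median absolute deviations). The function names and default thresholds are hypothetical.

```typescript
type QualityFlag = "missing" | "flatline" | "spike";

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// series: recent readings for one sensor, oldest first; null marks a missing reading.
function checkSensorQuality(
  series: (number | null)[],
  flatlineLength = 5, // assumed: this many identical readings in a row counts as a flatline
  spikeThreshold = 3, // assumed: deviation from the median, in units of MAD
): QualityFlag[] {
  const latest = series[series.length - 1];
  if (latest == null) return ["missing"];

  const flags: QualityFlag[] = [];

  // Flatline: the last flatlineLength readings are all the same value.
  const tail = series.slice(-flatlineLength);
  if (tail.length === flatlineLength && tail.every((v) => v === latest)) flags.push("flatline");

  // Spike: the latest reading is far from the median of the earlier readings.
  const history = series.slice(0, -1).filter((v): v is number => v !== null);
  if (history.length >= 3) {
    const med = median(history);
    const mad = median(history.map((v) => Math.abs(v - med)));
    if (mad > 0 && Math.abs(latest - med) > spikeThreshold * mad) flags.push("spike");
  }
  return flags;
}
```

Median-based statistics are used here because a single earlier outlier barely moves them, so one bad reading does not mask the next.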

What we learned

Real-time systems surface integration bugs that unit tests never catch. The gap between "the model predicts a breach" and "an operator gets a useful, actionable notification at the right time" is where most of the engineering lives. We also learned that satellite-informed environmental data (ERA5 reanalysis via Open-Meteo) is surprisingly accessible and adds meaningful context that pure sensor dashboards miss entirely.
