Inspiration

Water is our most valuable resource, yet the way we monitor its safety is stuck in the past. Official environmental sensors (like those from the USGS) provide incredibly accurate telemetry, but they lack human context. A sensor can tell you the pH is 7.5, but it can't tell you the water is choked with a toxic green algae bloom or industrial runoff. We realized that true water safety requires a fusion of rigorous numerical data and local, human-driven crowd-sourcing.

We wanted to build a platform that empowers communities to protect their local waterways by creating a bridge between raw data and human experience.

What it does

DeepBlue is a real-time water safety intelligence platform that acts as the "Waze for waterways."

It continuously ingests live sensor data (pH, temperature, turbidity) from 180+ monitoring stations across New York. When community members spot something concerning, such as discoloration, debris, or algae, they can upload a photo directly to the app.

Our multimodal AI analyzes the image to identify visible hazards, then fuses that qualitative visual risk with the station's sensor readings to update its safety rating in real time. Users can then click (+) AI Advisory to receive a plain-English safety recommendation grounded in official EPA and WHO guidelines.

If a station is rated "Dangerous" or "Moderate", users can also tap Nearest Safe, and the map instantly pans to the closest currently-safe station, calculated seamlessly using MongoDB's native geospatial engine.

How we built it

We designed a highly reactive, event-driven architecture built on MongoDB Atlas and AWS Bedrock.

Backend: A Golang microservice with a background metronome that periodically pulls thousands of live telemetry points from the USGS API. It upserts them into MongoDB Atlas using bulk write operations, keeping 180+ station records current.
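As a rough illustration of the upsert step, the sketch below builds one "update or insert" operation per reading, keyed on the station. It is driver-free so it runs with the standard library alone. With the official Go driver, the same filter and update documents would typically be wrapped in `mongo.UpdateOneModel` values and passed to `Collection.BulkWrite`. The `Reading` type and every field name (`station_id`, `ph`, `temp_c`, `turbidity`, `observed_at`) are assumptions for this example, not the project's actual schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Reading is a hypothetical, simplified telemetry record for one station.
type Reading struct {
	StationID  string
	PH         float64
	TempC      float64
	Turbidity  float64
	ObservedAt time.Time
}

// upsertOps builds one updateOne-with-upsert operation per reading.
// Filtering on station_id means a known station is updated in place and an
// unseen station is inserted, so repeated polls do not create duplicates.
func upsertOps(readings []Reading) []map[string]any {
	ops := make([]map[string]any, 0, len(readings))
	for _, r := range readings {
		ops = append(ops, map[string]any{
			"updateOne": map[string]any{
				"filter": map[string]any{"station_id": r.StationID},
				"update": map[string]any{"$set": map[string]any{
					"ph":        r.PH,
					"temp_c":    r.TempC,
					"turbidity": r.Turbidity,
					// Formatted as a string only so the example prints cleanly.
					"observed_at": r.ObservedAt.UTC().Format(time.RFC3339),
				}},
				"upsert": true,
			},
		})
	}
	return ops
}

func main() {
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	ops := upsertOps([]Reading{
		{StationID: "station-a", PH: 7.5, TempC: 11.2, Turbidity: 3.1, ObservedAt: now},
		{StationID: "station-b", PH: 6.9, TempC: 9.8, Turbidity: 12.4, ObservedAt: now},
	})
	out, err := json.MarshalIndent(ops, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Sending all operations in one bulk write keeps each polling cycle to a single round trip instead of one per station.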

The Reactive Core (MongoDB Change Streams): When a user uploads a community photo, the image lands in AWS S3 and its metadata is written to our community_reports collection. We use a MongoDB Change Stream as a persistent background watcher that fires the moment a new document is inserted: no polling, no cron job. This is the heartbeat of our entire AI pipeline.
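The watcher logic can be sketched without a live cluster. The example below shows an insert-only `$match` pipeline of the kind a watcher could pass when opening the change stream, plus a small function that turns one change event into a job for the AI worker. Real change events arrive as BSON through the driver's `Collection.Watch` cursor; the JSON sample, the `AnalysisJob` type, and the `station_id` and `s3_key` fields are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// insertOnlyPipeline restricts the stream to newly inserted documents.
func insertOnlyPipeline() []map[string]any {
	return []map[string]any{
		{"$match": map[string]any{"operationType": "insert"}},
	}
}

// ChangeEvent mirrors the parts of a change event used here.
// The fullDocument fields are hypothetical.
type ChangeEvent struct {
	OperationType string `json:"operationType"`
	FullDocument  struct {
		StationID string `json:"station_id"`
		S3Key     string `json:"s3_key"`
	} `json:"fullDocument"`
}

// AnalysisJob is what the watcher hands to the background AI worker.
type AnalysisJob struct {
	StationID string
	S3Key     string
}

// toJob decodes one event and reports whether it should trigger analysis.
func toJob(raw []byte) (AnalysisJob, bool, error) {
	var ev ChangeEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return AnalysisJob{}, false, fmt.Errorf("decode change event: %w", err)
	}
	if ev.OperationType != "insert" || ev.FullDocument.S3Key == "" {
		return AnalysisJob{}, false, nil
	}
	job := AnalysisJob{StationID: ev.FullDocument.StationID, S3Key: ev.FullDocument.S3Key}
	return job, true, nil
}

func main() {
	pipeline, err := json.Marshal(insertOnlyPipeline())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pipeline))

	event := []byte(`{"operationType":"insert","fullDocument":{"station_id":"station-a","s3_key":"reports/example.jpg"}}`)
	jobs := make(chan AnalysisJob, 1) // buffered so the watcher does not block on the worker
	job, ok, err := toJob(event)
	if err != nil {
		panic(err)
	}
	if ok {
		jobs <- job
	}
	close(jobs)
	for j := range jobs {
		fmt.Printf("analyze %s for station %s\n", j.S3Key, j.StationID)
	}
}
```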

Multimodal AI Pipeline: The Change Stream triggers an asynchronous call to Claude 3.5 Sonnet via AWS Bedrock. Claude analyzes the image, extracts hazard tags (e.g., "algae", "turbid"), and computes a visual risk score that is immediately written back to the station's document in Atlas, automatically re-rendering its safety label on the live map.
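One simple way to picture the fusion step is a "worst signal wins" rule: a clean sensor reading should not mask a photo of an algae bloom, and vice versa. The sketch below is only an illustration of that idea. The 0 to 1 score scale, the cut-offs at 0.33 and 0.66, and the function names are all assumptions, not DeepBlue's actual scoring logic.

```go
package main

import "fmt"

// clamp01 keeps a score inside the [0, 1] range.
func clamp01(x float64) float64 {
	if x < 0 {
		return 0
	}
	if x > 1 {
		return 1
	}
	return x
}

// fuseRisk combines a sensor-derived risk and a photo-derived risk by
// taking the higher of the two (hypothetical rule for this sketch).
func fuseRisk(sensorRisk, visualRisk float64) float64 {
	s, v := clamp01(sensorRisk), clamp01(visualRisk)
	if v > s {
		return v
	}
	return s
}

// labelFor maps a fused score to a map label. The cut-offs are illustrative.
func labelFor(score float64) string {
	switch {
	case score >= 0.66:
		return "Dangerous"
	case score >= 0.33:
		return "Moderate"
	default:
		return "Safe"
	}
}

func main() {
	// Sensors look fine (0.10), but the photo shows a heavy bloom (0.80).
	fused := fuseRisk(0.10, 0.80)
	fmt.Println(fused, labelFor(fused)) // prints "0.8 Dangerous"
}
```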

RAG & Atlas Vector Search: For our AI advisories, we chunked and embedded official EPA and WHO water safety guidelines, storing the vector embeddings directly in Atlas. When a user requests an advisory, we use MongoDB Atlas Vector Search ($vectorSearch) to perform a semantic similarity search across thousands of pages of health guidelines. Claude then synthesizes the sensor data, the community photo, and the retrieved guideline passages into a public advisory grounded in those official sources.
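The retrieval step can be sketched as a two-stage aggregation pipeline: a `$vectorSearch` stage that finds the nearest guideline chunks, followed by a `$project` stage that returns the text along with its similarity score. The index name `guidelines_vector_index` and the fields `embedding`, `text`, and `source` are assumptions for this example, as is the choice to oversample candidates by a factor of 20.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// advisoryPipeline builds a pipeline that retrieves the k guideline chunks
// whose embeddings are closest to queryVector.
// Index and field names are hypothetical.
func advisoryPipeline(queryVector []float64, k int) []map[string]any {
	return []map[string]any{
		{"$vectorSearch": map[string]any{
			"index":         "guidelines_vector_index",
			"path":          "embedding",
			"queryVector":   queryVector,
			"numCandidates": k * 20, // consider more candidates than are returned
			"limit":         k,
		}},
		{"$project": map[string]any{
			"_id":    0,
			"text":   1,
			"source": 1,
			"score":  map[string]any{"$meta": "vectorSearchScore"},
		}},
	}
}

func main() {
	// A real query vector would come from the same embedding model used to
	// embed the guideline chunks; this short vector is a placeholder.
	pipeline := advisoryPipeline([]float64{0.12, -0.03, 0.88}, 5)
	out, err := json.MarshalIndent(pipeline, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The retrieved `text` fields would then be placed in the prompt alongside the sensor readings, so the advisory can quote guidance rather than rely on the model's memory.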

Geospatial (MongoDB $near): Our "Nearest Safe Station" feature is powered entirely by MongoDB's native geospatial engine. We maintain a 2dsphere index on every station's GeoJSON location field. When a user clicks, a single MongoDB FindOne with a $near filter returns the closest safe station in milliseconds, with zero application-level distance math required.
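A minimal sketch of the filter behind such a lookup is shown below. Because `$near` returns results sorted from nearest to farthest, passing this filter to a find-one call yields the closest match. The field names `safety_label` and `location` and the label value "Safe" are assumptions for this example. Note that GeoJSON coordinates are ordered longitude first, then latitude.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nearestSafeFilter builds a query filter for the closest station that is
// currently rated safe. Field names are hypothetical.
func nearestSafeFilter(lng, lat float64) map[string]any {
	return map[string]any{
		"safety_label": "Safe",
		"location": map[string]any{
			"$near": map[string]any{
				"$geometry": map[string]any{
					"type":        "Point",
					"coordinates": []float64{lng, lat}, // GeoJSON order: [longitude, latitude]
				},
			},
		},
	}
}

func main() {
	// Example coordinates near Albany, NY.
	filter := nearestSafeFilter(-73.7562, 42.6526)
	out, err := json.MarshalIndent(filter, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```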

Frontend: A custom Vanilla JS/HTML UI with Leaflet maps, featuring a floating action button, custom blurred modal flows, and real-time polling for a seamless, app-like experience.

Challenges we ran into

One of the hardest parts of this project was figuring out how to elegantly blend slow, highly-accurate numerical data (sensors) with fast, unpredictable qualitative data (human photos).

Architecturally, we struggled initially with keeping the UI snappy while running complex, multimodal AI analysis. Waiting for S3 uploads, Bedrock API calls, and database writes to execute synchronously was causing poor UX and timeouts. We solved this by completely decoupling the process, using MongoDB Change Streams to offload AI analysis to an asynchronous background worker, while implementing a real-time polling mechanism on the frontend to deliver a smooth "Processing → Success" flow.
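The decoupling described above comes down to a small piece of shared state: the upload path marks a report as processing and returns immediately, the background worker flips it to success when analysis finishes, and the frontend polls for the current value. The sketch below keeps that state in memory purely for illustration. In DeepBlue it would more plausibly live in the report document itself, and the `StatusStore` type and status names are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is the processing state a polling client sees.
type Status string

const (
	StatusUnknown    Status = "unknown"
	StatusProcessing Status = "processing"
	StatusSuccess    Status = "success"
)

// StatusStore is a hypothetical, concurrency-safe status lookup.
type StatusStore struct {
	mu sync.RWMutex
	m  map[string]Status
}

func NewStatusStore() *StatusStore {
	return &StatusStore{m: make(map[string]Status)}
}

// Set records the latest status for a report.
func (s *StatusStore) Set(id string, st Status) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[id] = st
}

// Get returns the current status, or StatusUnknown for an unseen report.
func (s *StatusStore) Get(id string) Status {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if st, ok := s.m[id]; ok {
		return st
	}
	return StatusUnknown
}

func main() {
	store := NewStatusStore()

	// Upload path: mark the report and return to the user right away.
	store.Set("report-1", StatusProcessing)
	fmt.Println(store.Get("report-1")) // what the first poll sees

	// Background worker: the slow AI analysis would run here.
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		store.Set("report-1", StatusSuccess)
	}()
	wg.Wait()

	fmt.Println(store.Get("report-1")) // what a later poll sees
}
```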

Accomplishments that we're proud of

We're incredibly proud of orchestrating a complex, multi-model AI pipeline entirely in Golang, a language not typically associated with AI-heavy hacking.

More than anything, we're proud of how much heavy lifting MongoDB Atlas did for us. Change Streams gave us a reactive backbone without requiring complex message queue infrastructure. Atlas Vector Search let us build a production-grade RAG pipeline with zero additional services. And the native $near geospatial operator turned what would have been a complex distance-sorting algorithm into a single, elegant database query.

Fusing sensor telemetry, unstructured image data via Claude Vision, dense vector embeddings, and geospatial proximity into a single cohesive user experience, all backed by exactly one database cluster, is what we're most proud of.

What we learned

We learned the immense power of treating MongoDB as more than just a document store. By leaning into Change Streams, we triggered complex AI pipelines directly from database events without cluttering our REST handlers. Atlas Vector Search showed us that semantic retrieval and document storage can (and should) live in the same system. And the $near geospatial operator reminded us that the best solution is often already built into your database.

We also deepened our understanding of RAG implementations: giving an LLM raw numerical data isn't enough. Combining sensor readings with authoritative, vector-searched grounding documents is what reduces hallucinations and produces AI advisories that are actually safe to act upon.

What's next for DeepBlue

Taking it National: Our architecture is designed to scale horizontally. By tapping into the broader USGS and EPA networks, we plan to scale DeepBlue across all 50 states.

Geo-Awareness & Data Integrity: We plan to add strict geofencing using MongoDB's $geoWithin operator, verifying that a user is physically near a station before allowing a photo upload, so that community data is location-verified and harder to tamper with.
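A geofence check of this kind could be expressed as a filter that matches a station only if its location falls inside a circle centered on the user. The sketch below uses `$geoWithin` with `$centerSphere`, which takes its radius in radians, so the distance in meters is divided by the Earth's radius. The field names, the 6,378.1 km radius constant, and the idea of running this as a count query before accepting an upload are assumptions for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// earthRadiusMeters is the approximate equatorial radius of the Earth.
const earthRadiusMeters = 6378100.0

// geofenceFilter matches the given station only if it lies within
// radiusMeters of the user's position. Field names are hypothetical.
func geofenceFilter(stationID string, userLng, userLat, radiusMeters float64) map[string]any {
	return map[string]any{
		"station_id": stationID,
		"location": map[string]any{
			"$geoWithin": map[string]any{
				"$centerSphere": []any{
					[]float64{userLng, userLat},      // center: [longitude, latitude]
					radiusMeters / earthRadiusMeters, // radius converted to radians
				},
			},
		},
	}
}

func main() {
	// Allow uploads only from within 500 m of the station.
	filter := geofenceFilter("station-a", -73.7562, 42.6526, 500)
	out, err := json.MarshalIndent(filter, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

If a count query with this filter returns zero, the user is outside the fence and the upload can be rejected before anything is written to S3.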

Station Media Galleries: A public, historical timeline of photos for every station, letting users visually track how their local river or lake changes over months and years (stored in a native time series collection in Atlas).

Proactive Alerts: Push notifications via SMS so local residents are instantly alerted when a waterway they follow drops to "Dangerous", triggered, naturally, by a MongoDB Change Stream.
