Inspiration

Air quality monitoring systems typically rely on reactive alerts, notifying operators only after pollution levels have already crossed unsafe thresholds. We set out to design a more proactive system, one where real-time streaming data, predictive modeling, and AI-driven insights work together to anticipate risks early and enable timely intervention rather than simply reporting incidents after they occur.


What it does

HSE Sentinel is a real-time industrial air quality intelligence system that continuously ingests CO₂ and PM2.5 telemetry from IoT sensors and transforms it into predictive safety insights.

The platform:

  • Streams high-frequency sensor data using Confluent Cloud (Kafka)

  • Aggregates and stabilizes noisy signals using Flink SQL windowing

  • Predicts near-future pollution severity using Google Vertex AI

  • Generates context-aware HSE recommendations using Gemini 2.0

  • Delivers live metrics, forecasts, alerts, and expert guidance through a real-time dashboard

Instead of reacting after thresholds are breached, HSE Sentinel helps operators anticipate hazards and act early.


How we built it

The system was built as an end-to-end streaming and AI pipeline:

Data Ingestion

A Python-based industrial IoT simulator produces high-frequency CO₂ and PM2.5 readings and streams them into Confluent Cloud using Kafka.
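
As a rough illustration of the simulator side, the sketch below builds one reading and serializes it as JSON. The field names, baseline values, and noise levels are assumptions chosen for the example, not the project's actual schema, and the Kafka producer call is left out so the snippet runs with the standard library alone.

```python
import json
import random
import time
from typing import Optional


def make_reading(sensor_id: str, rng: random.Random, now: Optional[float] = None) -> dict:
    """Build one simulated telemetry reading (field names are illustrative)."""
    return {
        "sensor_id": sensor_id,
        "ts": time.time() if now is None else now,
        # Gaussian jitter around an assumed baseline, clamped to stay non-negative.
        "co2_ppm": round(max(0.0, rng.gauss(600.0, 40.0)), 1),
        "pm25_ugm3": round(max(0.0, rng.gauss(20.0, 5.0)), 1),
    }


def encode(reading: dict) -> bytes:
    """Serialize a reading as UTF-8 JSON, a common format for Kafka message values."""
    return json.dumps(reading, sort_keys=True).encode("utf-8")
```

In a real producer loop, the bytes returned by `encode` would be handed to a Kafka client as the message value, typically keyed by `sensor_id` so that readings from one sensor stay ordered within a partition.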

Stream Processing

Flink SQL performs 30-second tumbling window aggregations to smooth sensor noise, compute trends, and prepare stable inputs for AI inference.
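
To make the tumbling window idea concrete, here is a minimal plain-Python sketch of the same semantics: every reading falls into exactly one fixed, non-overlapping 30-second bucket, and each bucket is reduced to its mean. This is not the Flink SQL job itself; the function name and the `(ts, value)` input shape are assumptions made for the example.

```python
from collections import defaultdict
from statistics import fmean

WINDOW_SECONDS = 30


def tumbling_window_averages(readings, window=WINDOW_SECONDS):
    """Average (ts, value) pairs over fixed, non-overlapping windows.

    Returns a list of (window_start, mean_value) sorted by window_start.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        # Align each timestamp to the start of the window that contains it.
        window_start = int(ts // window) * window
        buckets[window_start].append(value)
    return [(start, fmean(values)) for start, values in sorted(buckets.items())]
```

A reading at 29.9 s lands in the window starting at 0, while one at exactly 30 s opens the next window. That boundary behavior is what distinguishes tumbling windows from sliding ones.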

Predictive AI

A time-series forecasting model hosted on Vertex AI predicts pollutant severity five minutes into the future.
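
The hosted model's architecture is not described here, so the sketch below is only a stand-in baseline: it fits a least-squares line through recent window averages and extrapolates five minutes ahead. A baseline like this can serve as a sanity check against a learned model's output. The function name and input shape are hypothetical.

```python
def linear_forecast(points, horizon_seconds=300.0):
    """Extrapolate a least-squares trend line beyond the last observation.

    `points` is a time-ordered list of (ts, value) pairs. Returns the value
    predicted horizon_seconds after the final timestamp.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to fit a trend")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    denom = sum((t - mean_t) ** 2 for t, _ in points)
    if denom == 0:
        raise ValueError("timestamps must not all be identical")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in points) / denom
    target_t = points[-1][0] + horizon_seconds
    # The fitted line passes through (mean_t, mean_v).
    return mean_v + slope * (target_t - mean_t)
```

For example, three 30-second averages rising by 10 ppm each give a slope of one third of a ppm per second, so the five-minute projection sits roughly 100 ppm above the latest average.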

Generative AI

Gemini 2.0 Flash-Lite analyzes live values, forecasts, and trends to generate real-time, human-readable HSE safety recommendations.
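
One plausible way to hand live values, forecasts, and trends to a generative model is to fold them into a short structured prompt. The sketch below shows only that prompt-assembly step. The wording, the 5% tolerance band, and both function names are assumptions for illustration, and the call to the model itself is left out.

```python
def trend_label(current: float, forecast: float, tolerance: float = 0.05) -> str:
    """Label the expected change as rising, falling, or stable.

    `tolerance` is a relative band around the current value (assumed 5%).
    """
    if current <= 0:
        return "stable" if forecast <= 0 else "rising"
    change = (forecast - current) / current
    if change > tolerance:
        return "rising"
    if change < -tolerance:
        return "falling"
    return "stable"


def build_hse_prompt(co2_now, co2_forecast, pm25_now, pm25_forecast, horizon_minutes=5):
    """Assemble a compact prompt from live values, forecasts, and trend labels."""
    lines = [
        "You are an HSE advisor for an industrial site.",
        f"CO2: {co2_now:.0f} ppm now, {co2_forecast:.0f} ppm forecast in "
        f"{horizon_minutes} min ({trend_label(co2_now, co2_forecast)}).",
        f"PM2.5: {pm25_now:.0f} ug/m3 now, {pm25_forecast:.0f} ug/m3 forecast in "
        f"{horizon_minutes} min ({trend_label(pm25_now, pm25_forecast)}).",
        "Give two or three short, actionable safety recommendations for operators.",
    ]
    return "\n".join(lines)
```

Putting the trend label in the prompt, rather than leaving the model to infer it from two numbers, is one simple way to keep recommendations consistent from one update to the next.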

Backend & Delivery

A FastAPI (ASGI) backend consumes Kafka streams asynchronously and pushes live updates to the frontend via WebSockets.
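
The consume-and-push pattern can be sketched with the standard library alone: one publisher fans each update out to a queue per connected client. This is a simplified stand-in, not the project's backend. The `Broadcaster` class and its method names are hypothetical, and the Kafka consumer and WebSocket handlers are replaced by direct calls.

```python
import asyncio


class Broadcaster:
    """Fan out each incoming message to every subscribed client queue."""

    def __init__(self, max_pending: int = 100) -> None:
        self._max_pending = max_pending
        self._clients = set()

    def subscribe(self) -> asyncio.Queue:
        queue = asyncio.Queue(maxsize=self._max_pending)
        self._clients.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._clients.discard(queue)

    def publish(self, message: dict) -> None:
        for queue in self._clients:
            if queue.full():
                # Drop the oldest update so a slow client never blocks the others.
                queue.get_nowait()
            queue.put_nowait(message)


async def demo() -> list:
    hub = Broadcaster()
    first, second = hub.subscribe(), hub.subscribe()
    hub.publish({"co2_ppm": 612.0})
    return [await first.get(), await second.get()]
```

In a FastAPI application, each WebSocket handler would call `subscribe()` on connect, forward queue items with `await websocket.send_json(...)`, and call `unsubscribe()` on disconnect, while a background task reading from Kafka calls `publish()`.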

Frontend

A React-based dashboard visualizes real-time data, forecasts, alerts, and AI-generated recommendations with sub-second latency.

The entire stack is containerized with Docker and fronted by Nginx for reliability and portability.


Challenges we ran into

  • Designing a streaming pipeline that balances high-frequency ingestion with stable, noise-resistant analytics

  • Synchronizing Kafka, Flink, AI inference, and WebSocket delivery without adding noticeable end-to-end latency

  • Ensuring AI recommendations felt context-aware, not rule-based or static


Accomplishments that we're proud of

  • Building a fully integrated streaming + AI system

  • Successfully combining Kafka, Flink, Vertex AI, and Gemini 2.0 into one pipeline

  • Delivering predictive insights before thresholds are breached, rather than alerts after the fact

  • Designing a clean separation between data ingestion, processing, inference, and visualization


What we learned

  • How to design event-driven, real-time architectures using Kafka and Flink

  • The importance of windowed aggregation before AI inference in streaming systems

  • How generative AI can enhance observability systems by providing actionable context

  • Best practices for building AI systems that are modular, secure, and production-ready

  • How frontend UX strongly influences the perceived intelligence of AI-driven systems


What's next for HSE Sentinel: Predictive Industrial IoT & AI Safety Pipeline

  • Support for multiple sensors, zones, and industrial sites

  • Longer-horizon forecasting and anomaly detection

  • Confidence scoring and explainability for AI recommendations

  • Integration with automated control systems and alert escalation workflows

  • Deployment on managed cloud infrastructure for continuous operation
