Inspiration

Unplanned engine failures cost the aviation industry over $150 billion annually. Current maintenance approaches are reactive: fix it when it breaks. With modern sensors generating thousands of data points, we saw an opportunity to flip this model: predict failure risk in real time and intervene proactively.

What it does

Catalyst is a real-time predictive maintenance system for turbofan engines:

  1. Streams sensor data: Readings from 21 sensors (temperature, pressure, and speed) flow through Confluent Kafka
  2. Computes features: Rolling statistics detect degradation patterns in real time
  3. Predicts risk: LSTM neural network scores failure risk from 0-100%
  4. Enables intervention: Operators can simulate maintenance actions and see cycles saved
  5. Explains alerts: Gemini AI generates natural language explanations

How we built it

Streaming Layer:

  • Confluent Cloud with 3 Kafka topics (sensor-events, feature-events, risk-events)
  • WebSocket server for real-time dashboard updates
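
As a rough illustration of how events might move between the three topics, here is a minimal sketch of JSON-encoded messages keyed by engine. The topic names come from the list above; the field names, the `encode`/`decode` helpers, and the key format are assumptions made for this example, not the project's actual schema.

```python
import json
from dataclasses import dataclass, asdict

# Topic names are from the writeup; everything else here is hypothetical.
SENSOR_TOPIC = "sensor-events"
FEATURE_TOPIC = "feature-events"
RISK_TOPIC = "risk-events"


@dataclass
class SensorEvent:
    engine_id: int
    cycle: int
    readings: dict  # sensor name -> raw value (assumed layout)


@dataclass
class RiskEvent:
    engine_id: int
    cycle: int
    risk_pct: float  # failure risk on the 0-100 scale


def encode(event) -> bytes:
    """Serialize an event as UTF-8 JSON, the byte form a producer would send."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


def decode(payload: bytes) -> dict:
    """Parse a message value back into a plain dict."""
    return json.loads(payload.decode("utf-8"))


def message_key(engine_id: int) -> bytes:
    """Key by engine so one engine's events land in the same partition, in order."""
    return f"engine-{engine_id}".encode("utf-8")
```

Keying by engine is one way to keep each engine's cycles in order for the downstream rolling statistics, since Kafka preserves ordering within a partition.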

ML Pipeline:

  • LSTM neural network trained on NASA C-MAPSS FD001 dataset (100 engines, 21 sensors)
  • Multi-task learning: predicts both risk score and remaining useful life
  • Physics-informed features: rolling variance, trend slopes, cross-sensor correlations
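
The three feature families named above can be sketched in plain Python. This is an illustrative version only: the function names are hypothetical, and the choices of sample variance, a least-squares slope against cycle index, and Pearson correlation are assumptions about what "rolling variance, trend slopes, cross-sensor correlations" mean here.

```python
from math import sqrt


def variance(xs):
    """Sample variance of a window; 0.0 when fewer than two points."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def trend_slope(xs):
    """Least-squares slope of xs against cycle index 0..n-1."""
    n = len(xs)
    if n < 2:
        return 0.0
    t_mean = (n - 1) / 2
    x_mean = sum(xs) / n
    num = sum((t - t_mean) * (x - x_mean) for t, x in enumerate(xs))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def correlation(xs, ys):
    """Pearson correlation of two windows; 0.0 if either is constant or too short."""
    n = min(len(xs), len(ys))
    if n < 2:
        return 0.0
    xs, ys = xs[-n:], ys[-n:]  # align on the most recent n cycles
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)
```

Each function takes the values currently in a sensor's window, so the same code works whether the window is full or still filling up.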

Cloud Infrastructure:

  • Google Cloud Run for serverless backend
  • Vertex AI for ML model serving with auto-scaling
  • Gemini 2.0 Flash for AI-powered alert explanations

Frontend:

  • React 18 with real-time WebSocket updates
  • Recharts for live sensor visualization
  • Fleet and single-engine views with maintenance simulation

Challenges we ran into

  • Progressive windowing: Features need to work from cycle 2, not just after a full 30-cycle window
  • Multi-sensor fusion: Combining the 14 sensors (of 21) that carry useful signal into meaningful health indicators
  • Real-time constraints: Keeping prediction latency under 100ms while computing rolling stats
  • Intervention modeling: Simulating how maintenance actions affect degradation trajectories
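
To make the progressive windowing challenge concrete, here is a minimal sketch of a window that starts producing statistics at cycle 2 and grows until it reaches the full 30 cycles. The class name, the returned field names, and the `fill` ratio are hypothetical; this shows one way to approach the problem, not necessarily how Catalyst does it.

```python
from collections import deque

WINDOW = 30  # full window length, from the writeup


class ProgressiveWindow:
    """Holds up to `size` recent values and reports stats once two exist."""

    def __init__(self, size=WINDOW):
        self.values = deque(maxlen=size)  # oldest value drops out when full

    def update(self, value):
        self.values.append(value)
        n = len(self.values)
        if n < 2:
            return None  # cycle 1: a variance is not defined yet
        mean = sum(self.values) / n
        var = sum((v - mean) ** 2 for v in self.values) / (n - 1)
        return {
            "n": n,
            "mean": mean,
            "variance": var,
            "fill": n / self.values.maxlen,  # how much of the window is populated
        }
```

Exposing the fill ratio alongside the statistics lets a downstream model discount features computed from only a few cycles.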

Accomplishments we're proud of

  • 94% failure detection rate with 25-cycle average lead time
  • End-to-end streaming pipeline from sensor to dashboard in under 200ms
  • Maintenance simulation that shows tangible cycles saved
  • Clean architecture with clear separation between streaming, ML, and presentation layers
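
The writeup does not say how detection rate and lead time are defined. One plausible reading, sketched below, counts an engine as detected if its risk score crosses an alert threshold before the failure cycle, and measures lead time from the first such alert to the failure. The function name, the input layout, and the threshold of 80 are assumptions for illustration only.

```python
def detection_metrics(engines, threshold=80.0):
    """Return (detection_rate, average_lead_time_in_cycles).

    engines: list of (risks, failure_cycle) pairs, where risks[i] is the
    risk score (0-100) at cycle i + 1. The threshold is a hypothetical value.
    """
    lead_times = []
    for risks, failure_cycle in engines:
        # First cycle before failure at which the score reaches the threshold.
        first_alert = next(
            (c for c, r in enumerate(risks, start=1)
             if r >= threshold and c < failure_cycle),
            None,
        )
        if first_alert is not None:
            lead_times.append(failure_cycle - first_alert)
    detected = len(lead_times)
    rate = detected / len(engines) if engines else 0.0
    avg_lead = sum(lead_times) / detected if detected else 0.0
    return rate, avg_lead
```

Under this definition, lead time is averaged over detected engines only, so a missed failure lowers the detection rate without affecting the lead-time figure.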

What we learned

  • Confluent Kafka's exactly-once semantics matter for ML predictions: a duplicated or dropped event skews the rolling features the model scores
  • Google Cloud Run handles WebSocket connections surprisingly well
  • Physics-informed features (variance, slopes) often outperform raw sensor values
  • Real-time ML is more about feature engineering than model complexity

What's next for Catalyst

  • Multi-engine correlation: Detect fleet-wide issues affecting multiple engines
  • Feedback loop: Track actual failures vs predictions to retrain models
  • Mobile alerts: Push notifications for critical risk thresholds
  • Schema Registry: Enforce data contracts across the streaming pipeline
