You can also feed live data. The live input currently accepts only 50 observations, but this can be scaled up.
Based on the data, an anomaly score is generated, and each sensor's readings (graph) can then be used by the AI to look for a probable cause.
All generated findings, including the probable cause, are added to the PDF report.
On the main dashboard, you can select the Engine No. and Sensor No.; the corresponding telemetry data is then displayed.
The report is saved as a PDF.

✈️ AI-Powered Aviation Anomaly Detector

Hybrid Deep Learning + Generative AI Diagnostic System

🧩 Abstract

This project implements an AI-driven aviation engine diagnostic platform that integrates a sequence-based anomaly detection model (LSTM Autoencoder) with a Large Language Model (LLM) for natural-language diagnosis generation.

The system analyzes multi-sensor turbofan engine data (NASA C-MAPSS datasets) to:

- Detect operational anomalies
- Identify deviating sensors
- Generate expert-level technical reports via Gemini (Google Generative Language API)
- Produce fully formatted PDF diagnostic reports

The application is deployed via Streamlit, enabling interactive data exploration and real-time anomaly assessment for both historical and live sensor feeds.

⚙️ System Architecture

        ┌────────────────────────────────────────────┐
        │             Streamlit Frontend              │
        │     (Interactive UI + Visualization)       │
        └────────────────────────────────────────────┘
                         │
                         ▼
        ┌────────────────────────────────────────────┐
        │       Data Pipeline & Preprocessing         │
        │  - Load NASA FD001–FD004 datasets          │
        │  - Apply MinMaxScaler (global normalization)│
        └────────────────────────────────────────────┘
                         │
                         ▼
        ┌────────────────────────────────────────────┐
        │        LSTM Autoencoder (Keras)             │
        │  - Sequence length = 50 cycles              │
        │  - 21 sensor inputs                         │
        │  - Trained on normal operating data         │
        │  - Computes reconstruction error (MAE)      │
        └────────────────────────────────────────────┘
                         │
                         ▼
        ┌────────────────────────────────────────────┐
        │   Anomaly Investigation Engine              │
        │  - Identify top 3 deviating sensors         │
        │  - Pass findings to Gemini LLM              │
        │  - Generate human-readable technical report │
        └────────────────────────────────────────────┘
                         │
                         ▼
        ┌────────────────────────────────────────────┐
        │       PDF Report Generator (FPDF)           │
        │  - Summarizes results, plots, LLM analysis  │
        │  - Exports professional-grade report        │
        └────────────────────────────────────────────┘
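The normalization and windowing stages in the diagram can be illustrated with a small sketch. This is not the project's code: it uses plain Python as a stand-in for `MinMaxScaler`, the function names `min_max_scale` and `make_windows` are hypothetical, and it assumes each row holds one cycle of sensor readings for a single engine.

```python
from typing import List

Row = List[float]


def min_max_scale(rows: List[Row]) -> List[Row]:
    """Scale each column to [0, 1] using its min and max over all rows."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has no range, so it is mapped to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled


def make_windows(rows: List[Row], seq_len: int = 50) -> List[List[Row]]:
    """Return every run of `seq_len` consecutive cycles as one model input."""
    return [rows[i:i + seq_len] for i in range(len(rows) - seq_len + 1)]


# Toy data: 60 cycles of 21 synthetic sensor values for one engine.
cycles = [[float(cycle + sensor) for sensor in range(21)] for cycle in range(60)]
windows = make_windows(min_max_scale(cycles), seq_len=50)
print(len(windows), len(windows[0]), len(windows[0][0]))  # 11 50 21
```

Each window has the shape (50 cycles, 21 sensors) that the autoencoder stage expects. An engine with fewer than 50 recorded cycles yields no windows in this sketch.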

🧠 Core Components

1. LSTM Autoencoder

Architecture:
- Encoder: 3 stacked LSTM layers compressing temporal signals
- Bottleneck latent representation
- Decoder: LSTM layers reconstructing input sequence
Objective Function:
Mean Absolute Error (MAE) between original and reconstructed sensor signals.
Anomaly Score:
MAE = (1/n) * Σ |xᵢ - x̂ᵢ|

where:

xᵢ is the observed sensor value
x̂ᵢ is the reconstructed sensor value
n is the number of data points

A high MAE indicates abnormal sensor behavior or a potential fault.
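As a worked illustration of the formula, the sketch below computes the overall MAE for one window and ranks sensors by their individual MAE. The per-sensor ranking is one plausible way to pick the top deviating sensors and is an assumption, as are the function names and the 1-based `s1`, `s2`, ... labels.

```python
from typing import List, Tuple

Window = List[List[float]]  # rows are cycles, columns are sensors


def mae_score(original: Window, reconstructed: Window) -> float:
    """Mean absolute error over every value in the window."""
    errors = [
        abs(x - x_hat)
        for row, row_hat in zip(original, reconstructed)
        for x, x_hat in zip(row, row_hat)
    ]
    return sum(errors) / len(errors)


def top_deviating_sensors(
    original: Window, reconstructed: Window, k: int = 3
) -> List[Tuple[str, float]]:
    """Rank sensors by their own MAE across the window, largest first."""
    n_cycles = len(original)
    n_sensors = len(original[0])
    per_sensor = []
    for j in range(n_sensors):
        total = sum(abs(original[t][j] - reconstructed[t][j]) for t in range(n_cycles))
        per_sensor.append((f"s{j + 1}", total / n_cycles))
    return sorted(per_sensor, key=lambda item: item[1], reverse=True)[:k]


# Toy window: 2 cycles x 4 sensors (a real window would be 50 x 21).
original = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
reconstructed = [[0.5, 0.25, 0.5, 1.0], [0.5, 0.25, 0.5, 0.0]]

print(mae_score(original, reconstructed))                   # 0.1875
print(top_deviating_sensors(original, reconstructed, k=2))  # [('s4', 0.5), ('s2', 0.25)]
```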

2. LLM-Powered Diagnostic Engine (Gemini API)

Uses Google’s generativelanguage.googleapis.com endpoint with the gemini-2.5-flash-preview-05-20 model.
The prompt combines:
- Anomaly score
- Top deviating sensors
Returns structured text containing:
- Expert Diagnosis
- Probable Root Cause

Example Prompt:
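A hypothetical sketch of how such a prompt could be assembled from the anomaly score and the top deviating sensors. The prompt wording and the helper names `build_prompt` and `build_payload` are assumptions, not the project's actual prompt; the `v1beta` path and request body follow the public `generateContent` REST format.

```python
from typing import List

# Host and model name come from the text above; the v1beta path is assumed.
MODEL = "gemini-2.5-flash-preview-05-20"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_prompt(anomaly_score: float, top_sensors: List[str]) -> str:
    """Hypothetical prompt wording that asks for the two labelled fields."""
    return (
        "You are an aviation engine diagnostics expert.\n"
        f"Anomaly score (reconstruction MAE): {anomaly_score:.4f}\n"
        f"Top deviating sensors: {', '.join(top_sensors)}\n"
        "Reply with two labelled lines:\n"
        "Expert Diagnosis: ...\n"
        "Probable Root Cause: ..."
    )


def build_payload(prompt: str) -> dict:
    """Wrap the prompt in a generateContent request body."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


prompt = build_prompt(0.1624, ["s3", "s9", "s14"])
payload = build_payload(prompt)
print(prompt)

# Sending the request (not executed here) could look like:
#   requests.post(ENDPOINT, params={"key": api_key}, json=payload, timeout=30)
```

Keeping prompt construction separate from the HTTP call makes the prompt easy to inspect and test without an API key.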

3. Streamlit User Interface

Tab 1 – Test Data Analysis:
Load and visualize engines from NASA FD001–FD004 datasets.
Tab 2 – Live Data Analysis:
Paste or stream real-time sensor readings for anomaly detection.
Interactive plots built with Plotly and Matplotlib.
PDF export via FPDF library.
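For the live-data tab, pasted readings need to be validated before they reach the model. The sketch below is an assumption about the input format (one observation per line, 21 comma- or whitespace-separated values, 50 lines in total); `parse_live_readings` is a hypothetical helper, not the app's actual parser.

```python
import re
from typing import List


def parse_live_readings(
    text: str, n_sensors: int = 21, seq_len: int = 50
) -> List[List[float]]:
    """Parse pasted text into `seq_len` rows of `n_sensors` floats."""
    rows = []
    for line_no, line in enumerate(text.strip().splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines between observations
        tokens = [tok for tok in re.split(r"[,\s]+", line.strip()) if tok]
        values = [float(tok) for tok in tokens]  # raises ValueError on bad input
        if len(values) != n_sensors:
            raise ValueError(
                f"line {line_no}: expected {n_sensors} values, got {len(values)}"
            )
        rows.append(values)
    if len(rows) != seq_len:
        raise ValueError(f"expected {seq_len} observations, got {len(rows)}")
    return rows


# Toy input: 50 identical observations of 21 values each.
pasted = "\n".join(", ".join("0.5" for _ in range(21)) for _ in range(50))
window = parse_live_readings(pasted)
print(len(window), len(window[0]))  # 50 21
```

Failing early with a line number gives the user a clearer message than a shape error raised inside the model.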

📦 Folder Structure

project_root/
│
├── app.py                     ← Streamlit app (main UI)
├── expert_rules.py
├── run_combined.py
├── run_project_fd001.py
├── run_project_fd002.py
├── run_project_fd003.py
├── run_project_fd004.py
│
├── data/
│   ├── readme.txt
│   ├── train_FD001.txt
│   ├── test_FD001.txt
│   ├── RUL_FD001.txt
│   ├── Damage Propagation Modeling.pdf
│   └── ... (same for FD002–FD004)
│
├── models/                    ← Trained models (.h5)
├── plots/                     ← Generated charts/reports
├── venv/                      ← Virtual environment
├── .env                       ← Contains secrets (DO NOT COMMIT)
└── .gitignore                 ← Controls what’s excluded from Git

🔒 Security and Privacy

Files and directories excluded from version control:

.env → contains GOOGLE_API_KEY
/models/ → stores trained weights
/data/ → raw NASA datasets
/plots/ → runtime outputs
/venv/ → local environment

All secrets are loaded via:

from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.getenv("GOOGLE_API_KEY")
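Because `os.getenv` returns `None` when the variable is missing, a small guard can fail fast with a clear message instead of sending an empty key to the API. The helper name `require_api_key` is hypothetical and not part of the project; it accepts any mapping so it can be exercised without touching the real environment.

```python
import os
from typing import Mapping


def require_api_key(
    env: Mapping[str, str] = os.environ, name: str = "GOOGLE_API_KEY"
) -> str:
    """Return the key from `env`, or raise if it is missing or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or the environment.")
    return value


# Demonstration with a stand-in mapping rather than the real environment.
key = require_api_key({"GOOGLE_API_KEY": "placeholder-value"})
print("key loaded:", bool(key))  # key loaded: True
```

In the app, calling `require_api_key()` after `load_dotenv()` would read from the real environment.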

🧪 Example Workflow

Select Dataset → e.g., FD002
Choose Engine ID
Click Generate PDF Report

The app then:

Computes reconstruction MAE (Mean Absolute Error)
Identifies abnormal sensors
Queries Gemini API for natural-language diagnosis
Outputs a formatted PDF report with plots and LLM explanations

📊 Example Output

Anomaly Score: 0.1624
Top Deviating Sensors: s3, s9, s14

LLM Output:

Expert Diagnosis: Compressor efficiency degradation detected.
Probable Root Cause: Stage 2 turbine wear leading to increased vibration and thermal imbalance.
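Output in this two-field shape could be split into named fields before it is written to the PDF. A minimal sketch, assuming the model labels each field on its own line; `parse_diagnosis` is a hypothetical helper, and it tolerates optional markdown emphasis around the labels.

```python
import re
from typing import Dict

FIELDS = ("Expert Diagnosis", "Probable Root Cause")


def parse_diagnosis(text: str) -> Dict[str, str]:
    """Extract each labelled field; missing fields map to an empty string."""
    result = {}
    for field in FIELDS:
        pattern = rf"^[ \t*#-]*{re.escape(field)}[ \t*]*:[ \t*]*(.+)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        result[field] = match.group(1).strip() if match else ""
    return result


llm_output = (
    "Expert Diagnosis: Compressor efficiency degradation detected.\n"
    "Probable Root Cause: Stage 2 turbine wear leading to increased "
    "vibration and thermal imbalance."
)
parsed = parse_diagnosis(llm_output)
print(parsed["Expert Diagnosis"])
print(parsed["Probable Root Cause"])
```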

Generated Report:
Anomaly_Report_Engine_FD002_123.pdf

🧠 Dependencies

| Library | Purpose |
|---|---|
| `tensorflow.keras` | LSTM Autoencoder architecture |
| `sklearn.preprocessing` | MinMax scaling |
| `streamlit` | Interactive web application |
| `plotly`, `matplotlib` | Visualization |
| `fpdf` | PDF generation |
| `requests` | Gemini API calls |
| `python-dotenv` | Secure environment variable management |
| `numpy`, `pandas` | Data manipulation |