title: Flight Delay Risk Predictor
emoji: ✈️
colorFrom: indigo
colorTo: blue
sdk: docker
app_file: app.py
pinned: false
license: mit
short_description: AI-powered flight delay probability estimation

✈️ Flight Schedule and Delay Prediction

An AI-powered flight delay risk prediction system trained on 7M+ US domestic flights from the 2024 Bureau of Transportation Statistics (BTS) dataset.

🎯 What It Does

Enter flight details (airline, route, time, weather conditions) and get:

  • Delay probability — likelihood of ≥15 minute arrival delay
  • Risk category — Low / Medium / High
  • Operational indicators — airport congestion, rolling delay averages, peak hour status
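
As an illustration of how a delay probability could be turned into the Low / Medium / High category, here is a minimal sketch. The function name `risk_category` and the cutoffs 0.3 and 0.6 are assumptions made for this example; the thresholds actually used by `app.py` are not stated here.

```python
def risk_category(probability: float,
                  low_cutoff: float = 0.3,
                  high_cutoff: float = 0.6) -> str:
    """Map a delay probability in [0, 1] to a coarse risk label.

    The cutoffs are hypothetical defaults, not the app's real thresholds.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < low_cutoff:
        return "Low"
    if probability < high_cutoff:
        return "Medium"
    return "High"


print(risk_category(0.12))  # Low
print(risk_category(0.45))  # Medium
print(risk_category(0.81))  # High
```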

🧠 Model Architecture

| Component | Details |
|---|---|
| Best Model | Logistic Regression (balanced) |
| ROC-AUC | 0.917 |
| Accuracy | 89.5% |
| F1 Score | 0.769 |
| Preprocessing | OneHotEncoder (categorical) + StandardScaler (numeric) |
| Target | Arrival delay ≥ 15 minutes (binary) |
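
The components listed above could be wired together in scikit-learn roughly as follows. This is a sketch under assumptions, not the contents of `model_training.py`: the column split, the `dep_hour` feature name, `max_iter=1000`, and the toy rows are all made up for illustration, and "balanced" is assumed to mean `class_weight="balanced"`.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column split; the real feature list lives in model_training.py.
CATEGORICAL = ["op_unique_carrier", "origin", "dest"]
NUMERIC = ["distance", "dep_hour"]


def build_pipeline() -> Pipeline:
    """One-hot encode categoricals, scale numerics, then fit a balanced logistic regression."""
    preprocessor = ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ("num", StandardScaler(), NUMERIC),
    ])
    model = LogisticRegression(class_weight="balanced", max_iter=1000)
    return Pipeline([("preprocess", preprocessor), ("model", model)])


# Toy rows, invented for illustration only.
toy = pd.DataFrame({
    "op_unique_carrier": ["AA", "DL", "AA", "UA", "DL", "UA"],
    "origin": ["JFK", "ATL", "ORD", "SFO", "ATL", "ORD"],
    "dest": ["LAX", "JFK", "JFK", "ORD", "SFO", "LAX"],
    "distance": [2475, 760, 740, 1846, 2139, 1744],
    "dep_hour": [8, 13, 18, 6, 21, 17],
    "arr_delay": [32.0, -4.0, 51.0, 3.0, 18.0, -10.0],
})

# Binary target: arrival delay of at least 15 minutes.
y = (toy["arr_delay"] >= 15).astype(int)

pipeline = build_pipeline().fit(toy[CATEGORICAL + NUMERIC], y)

# Column 1 of predict_proba is the probability of the positive (delayed) class.
delay_probability = pipeline.predict_proba(toy[CATEGORICAL + NUMERIC])[:, 1]
```

Keeping the preprocessor and the classifier in one `Pipeline` means the same encoding and scaling are applied at training and prediction time.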

Feature Engineering Pipeline

Basic features: airline, origin/destination airports, distance, departure hour, day of week, month
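
The time-based basic features can be derived from `fl_date` and `crs_dep_time`. The sketch below assumes `fl_date` is an ISO date string and `crs_dep_time` is an integer in local HHMM form (for example 1745); the function name and output keys are hypothetical.

```python
from datetime import date


def basic_time_features(fl_date: str, crs_dep_time: int) -> dict:
    """Derive departure hour, day of week, and month from the schedule columns."""
    flight_day = date.fromisoformat(fl_date)
    # Integer-divide HHMM by 100 to get the hour; wrap 2400 to hour 0.
    dep_hour = (crs_dep_time // 100) % 24
    return {
        "dep_hour": dep_hour,
        "day_of_week": flight_day.isoweekday(),  # Monday=1 ... Sunday=7
        "month": flight_day.month,
    }


print(basic_time_features("2024-07-04", 1745))
# {'dep_hour': 17, 'day_of_week': 4, 'month': 7}
```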

Advanced features:

  • 🏢 Airport Traffic Index — normalized hourly flight volume per airport
  • 📊 Rolling Delay Average (3h) — mean delays at origin in the past 3 hours
  • 🌧️ Weather Severity Index — proxy from BTS weather delay attribution
  • ✈️ Previous Flight Delay — delay of the aircraft's previous route leg
  • ⏱️ Turnaround Buffer — minutes between previous arrival and current departure
  • 🕐 Peak Hour Indicator — morning/evening commuter windows
  • 🎄 Holiday Indicator — US federal holidays
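
Three of the advanced features above can be sketched with the standard library alone. Everything named here is an assumption for illustration: the peak windows (06:00 to 09:00 and 16:00 to 19:00), the tuple layout of `history`, and the choice to return 0.0 when no flights fall inside the window. The real pipeline may define these differently.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical commuter windows as [start_hour, end_hour).
PEAK_WINDOWS = ((6, 9), (16, 19))


def is_peak_hour(dep_hour: int) -> bool:
    """True when the scheduled departure hour falls in a peak window."""
    return any(start <= dep_hour < end for start, end in PEAK_WINDOWS)


def turnaround_buffer_minutes(prev_arrival: datetime, next_departure: datetime) -> float:
    """Minutes between the aircraft's previous arrival and its next departure."""
    return (next_departure - prev_arrival).total_seconds() / 60.0


def rolling_delay_average(history, origin, as_of, window=timedelta(hours=3)):
    """Mean delay of flights that left `origin` in [as_of - window, as_of).

    `history` is an iterable of (origin, departure_time, delay_minutes) tuples.
    Flights at or after `as_of` are excluded so the feature only uses the past.
    """
    recent = [delay for airport, dep_time, delay in history
              if airport == origin and as_of - window <= dep_time < as_of]
    return mean(recent) if recent else 0.0


history = [
    ("ATL", datetime(2024, 3, 1, 13, 30), 10.0),  # older than 3 hours, ignored
    ("ATL", datetime(2024, 3, 1, 15, 0), 20.0),
    ("ATL", datetime(2024, 3, 1, 16, 15), 40.0),
    ("JFK", datetime(2024, 3, 1, 16, 0), 90.0),   # different airport, ignored
]
as_of = datetime(2024, 3, 1, 17, 0)

print(rolling_delay_average(history, "ATL", as_of))  # 30.0
print(turnaround_buffer_minutes(datetime(2024, 3, 1, 16, 10), as_of))  # 50.0
print(is_peak_hour(17), is_peak_hour(12))  # True False
```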

📁 Project Files

| File | Purpose |
|---|---|
| app.py | Streamlit app for interactive predictions |
| model_training.py | End-to-end training pipeline (data → model → artifacts) |
| requirements.txt | Python dependencies |
| artifacts_auc3/ | Saved model, preprocessor, metadata, metrics |

🏗️ Training Locally

pip install -r requirements.txt
python model_training.py --data-path "flight_data_2024.csv" --out-dir artifacts_auc3
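
For readers curious how the two flags in the command above might be handled, here is a minimal `argparse` sketch. Only the flag names `--data-path` and `--out-dir` come from the command; the help text, the `required` setting, and the default output directory are assumptions, and the real `model_training.py` may parse its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the training flags; defaults here are illustrative guesses."""
    parser = argparse.ArgumentParser(description="Train the flight delay model")
    parser.add_argument("--data-path", required=True,
                        help="CSV file of BTS flight records")
    parser.add_argument("--out-dir", default="artifacts_auc3",
                        help="directory for the saved model and metrics")
    return parser.parse_args(argv)


args = parse_args(["--data-path", "flight_data_2024.csv", "--out-dir", "artifacts_auc3"])
print(args.data_path, args.out_dir)  # flight_data_2024.csv artifacts_auc3
```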

🚀 Running Locally

pip install -r requirements.txt
streamlit run app.py

📊 Dataset

  • Source: Bureau of Transportation Statistics (BTS), 2024 US domestic flights
  • Size: ~7 million flight records, 35 columns
  • Key columns: fl_date, op_unique_carrier, origin, dest, crs_dep_time, distance, arr_delay, dep_delay, weather_delay, cancelled, diverted
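
The binary target can be derived from the key columns roughly as shown below. This sketch assumes that cancelled and diverted flights, and rows with a missing `arr_delay`, are dropped before labelling; the project may treat them differently. The sample rows and the `delayed` column name are invented for illustration.

```python
import csv
import io

# Invented sample rows using a subset of the key columns.
SAMPLE_CSV = """fl_date,op_unique_carrier,origin,dest,arr_delay,cancelled,diverted
2024-01-05,AA,JFK,LAX,32.0,0,0
2024-01-05,DL,ATL,JFK,-4.0,0,0
2024-01-05,UA,ORD,SFO,,1,0
2024-01-06,DL,ATL,SFO,15.0,0,0
"""


def labelled_rows(handle):
    """Yield completed flights with a `delayed` flag (arrival delay >= 15 minutes)."""
    for row in csv.DictReader(handle):
        # Skip flights with no usable arrival delay (assumed handling).
        if float(row["cancelled"]) or float(row["diverted"]) or row["arr_delay"] == "":
            continue
        row["delayed"] = int(float(row["arr_delay"]) >= 15)
        yield row


rows = list(labelled_rows(io.StringIO(SAMPLE_CSV)))
print([r["delayed"] for r in rows])  # [1, 0, 1]
```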

📜 License

MIT
