title: Flight Delay Risk Predictor
emoji: ✈️
colorFrom: indigo
colorTo: blue
sdk: docker
app_file: app.py
pinned: false
license: mit
short_description: AI-powered flight delay probability estimation
✈️ Flight Schedule and Delay Prediction
An AI-powered system that predicts flight delay risk, trained on 7M+ US domestic flights from the 2024 Bureau of Transportation Statistics (BTS) dataset.
🎯 What It Does
Enter flight details (airline, route, time, weather conditions) and get:
- Delay probability: likelihood of an arrival delay of 15 minutes or more
- Risk category: Low / Medium / High
- Operational indicators: airport congestion, rolling delay averages, peak hour status
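
As a rough illustration of how a delay probability could be turned into a risk category, here is a minimal sketch. The function name `risk_category` and the cut-offs `0.3` and `0.6` are assumptions made for this example; the thresholds the app actually uses are not stated here.

```python
def risk_category(probability: float, medium_at: float = 0.3, high_at: float = 0.6) -> str:
    """Map a delay probability in [0, 1] to "Low", "Medium", or "High".

    The default cut-offs are illustrative assumptions, not the app's real thresholds.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= high_at:
        return "High"
    if probability >= medium_at:
        return "Medium"
    return "Low"


# Example: three hypothetical model outputs.
categories = [risk_category(p) for p in (0.12, 0.45, 0.81)]
```

Keeping the thresholds as parameters makes it easy to tune them against the precision/recall trade-off you care about.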
🧠 Model Architecture
| Component | Details |
|---|---|
| Best Model | Logistic Regression (balanced) |
| ROC-AUC | 0.917 |
| Accuracy | 89.5% |
| F1 Score | 0.769 |
| Preprocessing | OneHotEncoder (categorical) + StandardScaler (numeric) |
| Target | Arrival delay ≥ 15 minutes (binary) |
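
The table above could correspond to a scikit-learn pipeline along these lines. This is a sketch under assumptions: the column names (`airline`, `origin`, `dest`, `distance`, `dep_hour`), the helper `build_pipeline`, and the synthetic data are all made up for illustration, and "balanced" is interpreted as `class_weight="balanced"`. The real feature list and settings live in `model_training.py`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names, chosen for this example only.
CATEGORICAL = ["airline", "origin", "dest"]
NUMERIC = ["distance", "dep_hour"]


def build_pipeline() -> Pipeline:
    """One-hot encode categoricals, scale numerics, then fit a balanced logistic regression."""
    preprocessor = ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ("num", StandardScaler(), NUMERIC),
    ])
    classifier = LogisticRegression(class_weight="balanced", max_iter=1000)
    return Pipeline([("prep", preprocessor), ("clf", classifier)])


# Tiny synthetic sample, for illustration only (not BTS data).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "airline": rng.choice(["AA", "DL", "UA"], n),
    "origin": rng.choice(["JFK", "ATL", "ORD"], n),
    "dest": rng.choice(["LAX", "SFO", "SEA"], n),
    "distance": rng.uniform(200, 2500, n),
    "dep_hour": rng.integers(5, 23, n),
})
arr_delay = rng.normal(5, 30, n) + 2 * (X["dep_hour"] - 12)
y = (arr_delay >= 15).astype(int)  # target: arrival delay of 15 minutes or more

model = build_pipeline().fit(X, y)
delay_probability = model.predict_proba(X.head(3))[:, 1]  # P(delayed) per flight
```

`handle_unknown="ignore"` lets the encoder accept airports or carriers at prediction time that it never saw during training.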
Feature Engineering Pipeline
Basic features: airline, origin/destination airports, distance, departure hour, day of week, month
Advanced features:
- 🏢 Airport Traffic Index: normalized hourly flight volume per airport
- 📊 Rolling Delay Average (3h): mean delay at the origin airport over the previous 3 hours
- 🌧️ Weather Severity Index: proxy derived from the BTS weather delay attribution
- ✈️ Previous Flight Delay: delay on the aircraft's previous route leg
- ⏱️ Turnaround Buffer: minutes between the previous arrival and the current departure
- 🕐 Peak Hour Indicator: morning/evening commuter windows
- 🎄 Holiday Indicator: US federal holidays
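
Three of the advanced features above can be sketched in plain Python. The function names, the peak-hour windows (06:00-09:59 and 16:00-19:59), and the sample flights are assumptions for illustration; the project's exact definitions are in `model_training.py`.

```python
from datetime import datetime, timedelta

# Assumed commuter windows by local departure hour; the real windows are not stated.
PEAK_HOURS = set(range(6, 10)) | set(range(16, 20))


def is_peak_hour(dep_hour: int) -> int:
    """Return 1 if the departure hour falls in an assumed commuter window, else 0."""
    return int(dep_hour in PEAK_HOURS)


def turnaround_buffer(prev_arrival: datetime, next_departure: datetime) -> float:
    """Minutes between the aircraft's previous arrival and its next departure."""
    return (next_departure - prev_arrival).total_seconds() / 60.0


def rolling_delay_average(flights, window_hours=3):
    """Mean delay of earlier departures from the same origin within the window.

    flights: list of (origin, dep_time, delay_minutes) tuples, in any order.
    Returns a list aligned with the input; None where no earlier flight qualifies.
    Only strictly earlier departures are used, so a flight never sees its own delay.
    """
    window = timedelta(hours=window_hours)
    order = sorted(range(len(flights)), key=lambda i: flights[i][1])
    history = {}  # origin -> list of (dep_time, delay) already processed
    result = [None] * len(flights)
    for i in order:
        origin, dep_time, delay = flights[i]
        past = [d for t, d in history.get(origin, []) if dep_time - window <= t < dep_time]
        result[i] = sum(past) / len(past) if past else None
        history.setdefault(origin, []).append((dep_time, delay))
    return result


# Made-up flights for illustration.
day = datetime(2024, 1, 1)
flights = [
    ("ATL", day.replace(hour=8), 10),
    ("ATL", day.replace(hour=9, minute=30), 30),
    ("ATL", day.replace(hour=10), 0),
    ("ATL", day.replace(hour=12, minute=30), 5),
    ("JFK", day.replace(hour=9), 50),
]
rolling = rolling_delay_average(flights)  # [None, 10.0, 20.0, 15.0, None]
```

On millions of rows, a grouped time-based rolling window in a dataframe library would be the practical choice; the loop above only shows the logic.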
📁 Project Files
| File | Purpose |
|---|---|
| `app.py` | Streamlit app for interactive predictions |
| `model_training.py` | End-to-end training pipeline (data → model → artifacts) |
| `requirements.txt` | Python dependencies |
| `artifacts_auc3/` | Saved model, preprocessor, metadata, metrics |
🏗️ Training Locally
pip install -r requirements.txt
python model_training.py --data-path "flight_data_2024.csv" --out-dir artifacts_auc3
🚀 Running Locally
pip install -r requirements.txt
streamlit run app.py
📊 Dataset
- Source: Bureau of Transportation Statistics (BTS), 2024 US domestic flights
- Size: ~7 million flight records, 35 columns
- Key columns: `fl_date`, `op_unique_carrier`, `origin`, `dest`, `crs_dep_time`, `distance`, `arr_delay`, `dep_delay`, `weather_delay`, `cancelled`, `diverted`
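
To show how these columns might feed the binary target, here is a small sketch using only the standard library. It assumes `crs_dep_time` is an `hhmm` value (for example `905` for 09:05) and that cancelled or diverted flights are excluded before labeling. Both are assumptions about the preprocessing, and the sample rows are invented.

```python
import csv
import io


def dep_hour(crs_dep_time: str) -> int:
    """Scheduled departure hour from an hhmm value such as '905' or '1430' (assumed format)."""
    return int(float(crs_dep_time)) // 100 % 24


def load_labeled_rows(csv_file):
    """Yield (row, label) pairs; label is 1 when arr_delay is 15 minutes or more.

    Cancelled or diverted flights, and rows without arr_delay, are skipped (an assumption).
    """
    for row in csv.DictReader(csv_file):
        if float(row["cancelled"] or 0) or float(row["diverted"] or 0):
            continue
        if row["arr_delay"] == "":
            continue  # no arrival delay recorded, so no label can be derived
        yield row, int(float(row["arr_delay"]) >= 15)


# Invented sample rows using a subset of the key columns.
sample = io.StringIO(
    "fl_date,op_unique_carrier,origin,dest,crs_dep_time,distance,arr_delay,cancelled,diverted\n"
    "2024-01-01,AA,JFK,LAX,905,2475,22,0,0\n"
    "2024-01-01,DL,ATL,ORD,1430,606,-4,0,0\n"
    "2024-01-01,UA,ORD,SFO,1800,1846,,1,0\n"
)
labeled = [(dep_hour(row["crs_dep_time"]), label) for row, label in load_labeled_rows(sample)]
# labeled == [(9, 1), (14, 0)]; the cancelled flight is dropped
```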
📜 License
MIT
Built With
- dockerfile
- python
