This project was inspired by the "Conflict for Customers," in which families and students are often priced out of "Fortress Hubs" because of market concentration rather than flight distance. We took a multi-stage analytical approach, starting with preliminary EDA in Pandas and Seaborn to surface basic trends. We then developed three distinct simulation models (a Baseline, a tuned Random Forest, and a Gradient Booster) and used SHAP values and log-distance transformations to quantify the structural cost of airline monopolies.
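To make the log-distance transformation concrete, here is a minimal sketch of why it can help a distance-only baseline. It assumes fares grow roughly with the logarithm of distance; the data is synthetic, and the names `distance_miles`, `fare_usd`, `fit_linear`, and `r_squared` are hypothetical, not taken from our notebooks. It uses plain least squares rather than the tree-based models described above.

```python
import numpy as np

# Synthetic, illustrative data only: none of these numbers come from the project.
rng = np.random.default_rng(0)
distance_miles = rng.uniform(100, 2500, size=500)
# Assumption: fare rises with log(distance), plus noise.
fare_usd = 40 + 35 * np.log(distance_miles) + rng.normal(0, 15, size=500)


def r_squared(y, y_hat):
    """Share of the variance in y explained by the fitted values y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot


def fit_linear(x, y):
    """Ordinary least squares for y ~ a + b*x; returns the fitted values."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef


# Compare a raw-distance fit against a log-distance fit on the same data.
r2_raw = r_squared(fare_usd, fit_linear(distance_miles, fare_usd))
r2_log = r_squared(fare_usd, fit_linear(np.log(distance_miles), fare_usd))

print(f"R^2 with raw distance: {r2_raw:.3f}")
print(f"R^2 with log distance: {r2_log:.3f}")
```

On data generated this way the log-distance fit explains more variance than the raw-distance fit. Whether the same holds on real fare data depends on how fares actually scale with distance.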

Through this process, we learned that physical geography is losing its predictive power: the correlation between distance and fare dropped significantly from 2022 to 2025. Instead, market structure variables now explain an additional 28 percentage points of airfare variation. One of our primary challenges was accounting for high-impact anomalies, specifically the 2022 pricing data, which had to be analyzed as a COVID-19 recovery outlier to avoid skewing our seasonal trend models. Ultimately, this data-driven journey allowed us to identify "Rip-Off Routes" and provide actionable "Student Budget Hacks" to help travelers navigate a complex pricing landscape.
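A gain in "percentage points of airfare variation" can be read as the difference in R² between a model with distance only and a model that also includes market structure. The sketch below shows that calculation on synthetic data. The feature `carrier_share` (a hypothetical stand-in for a market concentration measure), the coefficients, and the use of ordinary least squares instead of our Random Forest and Gradient Booster are all assumptions for illustration, so the printed gain is not the figure reported above.

```python
import numpy as np

# Synthetic, illustrative data only: none of these numbers come from the project.
rng = np.random.default_rng(1)
n = 1000
log_distance = np.log(rng.uniform(100, 2500, size=n))
# Hypothetical market-structure feature: dominant carrier's share of a route (0 to 1).
carrier_share = rng.uniform(0.2, 1.0, size=n)
fare_usd = 50 + 30 * log_distance + 120 * carrier_share + rng.normal(0, 20, size=n)


def ols_r_squared(features, y):
    """Fit y on the given feature columns (plus intercept) and return in-sample R^2."""
    X = np.column_stack([np.ones(len(y))] + list(features))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)


# Baseline: distance only. Full: distance plus market structure.
r2_base = ols_r_squared([log_distance], fare_usd)
r2_full = ols_r_squared([log_distance, carrier_share], fare_usd)

# Express the improvement in percentage points of explained variance.
gain_pp = 100 * (r2_full - r2_base)

print(f"Distance only:        R^2 = {r2_base:.3f}")
print(f"Distance + structure: R^2 = {r2_full:.3f}")
print(f"Gain: {gain_pp:.1f} percentage points")
```

For nested least-squares models the in-sample gain can never be negative, so a comparison on real data is better made on held-out routes or with cross-validation.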
