Inspiration
For many, booking a flight feels like a game of chance. We are often told to "book on Tuesdays" or "use incognito mode," but these are anecdotal myths. We were inspired to look past the surface-level noise and uncover the structural factors of airfare. We wanted to know: Why does a 500-mile flight sometimes cost more than a 1,500-mile flight? By analyzing U.S. Department of Transportation data, we set out to quantify the "hidden taxes" created by market monopolies and the "price ceilings" created by budget competition.
What it does
Our project provides a predictive framework to determine the "fair fare" for any given domestic route. Given distance, passenger volume, and carrier market share, our model can:
- Identify "Fortress Hubs" where consumers pay a premium due to limited infrastructure access.
- Quantify the "LCC Effect": the estimated volume of low-cost carrier (LCC) seats required to force a legacy carrier to lower its prices.
- Flag "Overpriced Frontiers": routes where actual fares significantly exceed what the operational drivers (distance and volume) justify.
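The flagging step in the last bullet can be sketched as a comparison between each route's actual fare and the fare a fitted model predicts. This is a minimal sketch rather than the project's code: the route names, the fares, and the 20% cutoff are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    actual_fare: float     # observed average fare, USD
    predicted_fare: float  # "fair fare" from a fitted model, USD


def flag_overpriced(routes, threshold=0.20):
    """Return (name, premium) pairs for routes whose actual fare exceeds
    the predicted fare by more than `threshold` (a hypothetical cutoff)."""
    flagged = []
    for r in routes:
        premium = r.actual_fare / r.predicted_fare - 1.0
        if premium > threshold:
            flagged.append((r.name, round(premium, 3)))
    # Largest premium first
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Made-up numbers for illustration only
routes = [
    Route("AAA-BBB", 310.0, 240.0),
    Route("CCC-DDD", 180.0, 175.0),
    Route("EEE-FFF", 420.0, 300.0),
]
overpriced = flag_overpriced(routes)
```

A relative premium is used instead of a dollar gap so that short and long routes are judged on the same scale.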
How we built it
We built the analysis in Python using a phased Ordinary Least Squares (OLS) regression approach. The methodology evolved in stages:
- Core Factors: We modeled fares as a function of distance and passenger volume, applying log-log transformations to account for diminishing marginal costs.
- Feature Engineering: We created a custom metric, lcc_volume, which represents the physical supply of budget seats, and hub_power, an interaction term capturing the dominance of an airline at route endpoints.
- Refinement: We integrated fixed effects (carrier and year dummies) to strip away brand-specific pricing and macro-inflationary trends.
- Final Model: Our final iteration achieved an adjusted R^2 of 0.727, explaining nearly 73% of fare variation.
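The staged specification above can be sketched as a single log-log regression. This is a sketch under stated assumptions, not our actual pipeline: the data is synthetic, the formulas for lcc_volume and hub_power are guesses (the write-up only describes them in words), the coefficients are invented, and NumPy's least-squares solver stands in for whichever OLS library is used in practice. Year dummies are left out for brevity and would be added the same way as the carrier dummies.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic route-level data (illustrative only, not DOT data)
distance = rng.uniform(200, 2500, n)      # miles
passengers = rng.uniform(50, 5000, n)     # passengers per day
lcc_share = rng.uniform(0, 1, n)          # share of seats flown by LCCs
share_origin = rng.uniform(0.1, 0.9, n)   # top carrier's share at origin
share_dest = rng.uniform(0.1, 0.9, n)     # top carrier's share at destination
carrier = rng.integers(0, 3, n)           # three hypothetical carriers

# Assumed feature definitions (hypothetical formulas)
lcc_volume = np.log1p(passengers * lcc_share)  # supply of budget seats
hub_power = share_origin * share_dest          # endpoint dominance interaction

# Invented "true" coefficients used to simulate log fares
carrier_effect = np.array([0.0, 0.05, -0.10])
log_fare = (2.0 + 0.45 * np.log(distance) - 0.08 * np.log(passengers)
            - 0.03 * lcc_volume + 0.30 * hub_power
            + carrier_effect[carrier] + rng.normal(0, 0.05, n))

# Design matrix: intercept, log-log core factors, engineered features,
# and carrier dummies (carrier 0 is the baseline)
X = np.column_stack([
    np.ones(n), np.log(distance), np.log(passengers),
    lcc_volume, hub_power,
    (carrier == 1).astype(float), (carrier == 2).astype(float),
])
beta, *_ = np.linalg.lstsq(X, log_fare, rcond=None)

distance_elasticity = beta[1]  # % change in fare per 1% change in distance
hub_premium = beta[4]          # log-fare shift per unit of hub_power
```

Because both sides are in logs, the distance coefficient reads directly as an elasticity, which is what makes the "diminishing marginal cost" interpretation possible.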
Challenges we ran into
- The "Hub" Complexity: Simply having a high market share doesn't always lead to high prices. We had to engineer interaction terms to distinguish between a "busy airport" and a "dominant hub," where one airline controls the gates and slots.
- Data Latency: Working with quarterly DOT reports meant we had to account for seasonality. A "cheap" flight in quarter 1 might be a "demand surge" flight in quarter 2.
- Non-Linearity: We initially used untransformed (linear) variables, but the model's fit improved sharply once we log-transformed them. The relationship between distance and cost is logarithmic: the first 500 miles are significantly more expensive to operate than the last 500.
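The non-linearity point can be illustrated with a small worked example. Under a log-log model, fare grows as a power of distance, so each additional mile adds less than the one before whenever the distance elasticity is below 1. The intercept and elasticity below are hypothetical values chosen for illustration, not our estimates.

```python
import math


def fare(distance_miles, intercept=2.0, elasticity=0.45):
    """Fare implied by ln(fare) = intercept + elasticity * ln(distance).
    Both parameter values are hypothetical."""
    return math.exp(intercept) * distance_miles ** elasticity


# Fare attributable to the first 500 miles versus miles 1,000 to 1,500
first_500 = fare(500)
last_500 = fare(1500) - fare(1000)

# Per-mile fare on a short route versus a long one
per_mile_short = fare(500) / 500
per_mile_long = fare(1500) / 1500
```

With any elasticity between 0 and 1, first_500 exceeds last_500 and the per-mile fare falls as the route gets longer, which is the pattern a straight-line model cannot capture.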
Accomplishments that we're proud of
- The 73% Threshold: Reaching an adjusted R^2 of 0.727 using purely structural market data (without knowing real-time fuel prices or daily demand) suggests that airfare is far more predictable than it appears.
- The "Price Ceiling" Discovery: We found that LCCs don't just "lower prices"; they act as a hard ceiling on them. Our model estimates the volume threshold at which legacy carriers are forced to stop extracting "monopoly taxes."
- Visualizing the Gap: We created a 2x2 market evolution grid that clearly shows which routes have moved from "Overpriced" to "Efficient" over the last four years.
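For context on the first bullet, adjusted R^2 penalizes plain R^2 for the number of predictors, which matters once carrier and year dummies are added. The helper below is a minimal sketch of the standard formula; the sample size and predictor count in the example are hypothetical, since the write-up does not report them.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1),
    where p counts predictors excluding the intercept."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical example: R^2 of 0.73 with 1,000 routes and 20 predictors
example = adjusted_r2(0.73, n_obs=1000, n_predictors=20)
```

The adjustment shrinks as the sample grows relative to the number of predictors, so on a large route panel the adjusted and unadjusted values stay close.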
What we learned
We learned that infrastructure is power. In the airline industry, owning the gates at a hub is more valuable than having the fastest plane. We also learned that for students and families, the best predictor of a "fair price" isn't the airline's brand but the presence of a high-volume budget competitor on that specific route. All major drivers in our model were statistically significant, which supports the view that the "physics" of the market usually outweighs the "marketing" of the airlines.
What's next
In the next version, we plan to:
- Integrate Fuel Spot Prices: Measure how much of a fare hike is "cost-push" versus "profit-pull."
- Airport Slot Analysis: Incorporate physical gate constraints to better predict where new LCCs can actually enter the market.
- Consumer Tooling: Turn our OLS coefficients into a user-friendly "Fare Fairness Calculator" for travelers to check if they are paying a "Hub Tax" before they click buy.
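A first cut of the calculator could look like the sketch below. It assumes a log-linear fare model with a hub_power term; the coefficient values and function names are placeholders for illustration, not our fitted estimates or an existing tool. Exponentiating a log-scale prediction also tends to understate the average fare slightly, so a production version would likely need a retransformation correction.

```python
import math

# Placeholder coefficients for a log-log fare model (hypothetical values)
COEFS = {
    "intercept": 2.0,
    "log_distance": 0.45,
    "log_passengers": -0.08,
    "hub_power": 0.30,
}


def predict_fare(distance, passengers, hub_power, coefs=COEFS):
    """Predicted fare in USD from the log-linear model."""
    log_fare = (coefs["intercept"]
                + coefs["log_distance"] * math.log(distance)
                + coefs["log_passengers"] * math.log(passengers)
                + coefs["hub_power"] * hub_power)
    return math.exp(log_fare)


def hub_tax(distance, passengers, hub_power, coefs=COEFS):
    """Dollar premium attributed to hub dominance: the predicted fare minus
    the fare the same route would have with hub_power set to zero."""
    with_hub = predict_fare(distance, passengers, hub_power, coefs)
    without_hub = predict_fare(distance, passengers, 0.0, coefs)
    return with_hub - without_hub


def check_quote(quoted_fare, distance, passengers, hub_power):
    """Summarize how a quoted fare compares with the model's fair fare."""
    fair = predict_fare(distance, passengers, hub_power)
    return {
        "fair_fare": round(fair, 2),
        "hub_tax": round(hub_tax(distance, passengers, hub_power), 2),
        "quote_vs_fair_pct": round(100 * (quoted_fare / fair - 1), 1),
    }


result = check_quote(quoted_fare=260.0, distance=800,
                     passengers=1200, hub_power=0.5)
```

Reporting the hub premium as a counterfactual difference keeps the output in dollars, which is easier for a traveler to act on than a raw coefficient.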