Flight Data Analysis

Inspiration

Our project's goal is to predict flight prices using a dataset containing features like airline, route, stops, and class. Over a series of iterations, we have successfully progressed from an initial non-working model to a robust, professional machine learning workflow that has produced a high-performing model and a framework for future improvements.

Evolution of our Project

Our work has progressed through three clear phases, each showing significant learning and improvement.

Phase 1: Initial Model and Poor Results

We began by training a single Decision Tree model (dt). The initial results were very poor, with predictions being off by tens or even hundreds of thousands of dollars. The model was severely overfitting the training data, leading to a likely negative R-squared (R²) score, meaning it performed worse than simply guessing the average flight price. This is a common starting point and was valuable for identifying the problem.
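To make the negative R² point concrete, here is a minimal sketch in plain Python. It is not our project code: the function name `r_squared` and all prices below are made up for illustration. It shows that always predicting the average price scores exactly 0, while wildly wrong predictions push the score below 0.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical prices, for illustration only (not from the project dataset).
actual = [5000.0, 7000.0, 12000.0, 40000.0]
mean_price = sum(actual) / len(actual)

# Baseline: guess the average price for every flight.
baseline = [mean_price] * len(actual)

# Predictions that are off by tens or hundreds of thousands.
wild = [90000.0, 3000.0, 150000.0, 5000.0]

baseline_r2 = r_squared(actual, baseline)
wild_r2 = r_squared(actual, wild)
print("baseline R^2:", baseline_r2)
print("wild R^2:", wild_r2)
```

Any model scoring below the baseline's 0 carries less information than the average price alone, which is why a negative score was such a clear signal that something was wrong.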

Phase 2: Proper Evaluation and Strong Results

Following guidance, we implemented proper regression metrics. This was a critical step that allowed us to accurately measure our model's performance. After applying improvements (likely using a tuned model or a more robust one like Random Forest), we achieved the following excellent results:

R-squared (R²): 0.89

Mean Absolute Error (MAE): $4,364.64

Root Mean Squared Error (RMSE): $7,513.06
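For readers unfamiliar with the two error metrics above, the following is a rough sketch of how MAE and RMSE are defined, written in plain Python. It is an illustration under assumptions, not our evaluation code: the function names and the sample prices are hypothetical. scikit-learn provides equivalents such as `mean_absolute_error` and `mean_squared_error` in `sklearn.metrics`.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average size of the errors, in the same units as the price."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared error; penalizes large misses more."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Hypothetical prices, for illustration only (not from the project dataset).
actual = [5000.0, 7000.0, 12000.0, 40000.0]
predicted = [6000.0, 6000.0, 15000.0, 33000.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print("MAE:", mae)
print("RMSE:", rmse)
```

RMSE is never smaller than MAE, and the gap widens when a few predictions miss by a lot. The gap between our reported RMSE and MAE may therefore point to a subset of flights with much larger errors than the rest.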

This was a major success. An R² of 0.89 signifies that our model could explain 89% of the variability in flight prices, making it a strong and useful predictive tool.

Built With

python

Updates

Saira Bano started this project — Aug 01, 2025 01:53 PM EDT
