Refuel, Today?

How many cars are there?

Inspiration

Our team was greatly inspired by this project from Chevron because it offered a new perspective on how even small initiatives can hold significant potential for adding business value and positively impacting our society.

What it does

By forecasting the number and types of vehicles that will be on the road in 2024, we can estimate future fuel demand and spot emerging trends in electric vehicle adoption. This data-driven approach supports infrastructure investment decisions and helps optimize resource allocation for gas and charging stations.

How we built it

We started our project with basic data exploration to identify trends, then addressed missing values using K-Nearest Neighbors (KNN) imputation. We also applied feature engineering and one-hot encoding to make our dataset more suitable for our model. For modeling, we experimented with Decision Trees and Random Forest before selecting XGBoost. Using Grid Search, we fine-tuned hyperparameters to optimize model performance, ultimately achieving a Root Mean Squared Error (RMSE) of approximately 6000, which is only about 2% of the total range of the values being predicted.
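To make the RMSE figure concrete, here is a minimal sketch of how an RMSE can be expressed as a share of the target's range. It is not our project code: the function names and the sample numbers are hypothetical, chosen only so the arithmetic lands on an RMSE of 6000 and a share of 2%. By the same arithmetic, an RMSE of about 6000 at about 2% implies a range on the order of 300,000 (6000 / 0.02).

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def range_normalized_rmse(y_true, y_pred):
    """RMSE divided by the spread (max - min) of the true values."""
    spread = max(y_true) - min(y_true)
    if spread == 0:
        raise ValueError("true values have zero range")
    return rmse(y_true, y_pred) / spread


# Hypothetical vehicle counts, made up for illustration only.
actual = [0, 50_000, 120_000, 300_000]
predicted = [6_000, 44_000, 126_000, 294_000]

error = rmse(actual, predicted)
share = range_normalized_rmse(actual, predicted)
print(f"RMSE: {error:.0f}, share of range: {share:.1%}")
```

Dividing by the range is only one way to normalize; dividing by the mean or the standard deviation of the target gives a different percentage, so it is worth stating which one is being reported.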

Challenges we ran into

We faced several significant challenges, including missing and inconsistent data and results we could not explain. Predicting a numeric outcome from a dataset made up mostly of categorical variables is already difficult. On top of that, we needed to find effective ways to handle the problematic parts of the dataset. Even more frustrating, despite our extensive efforts to clean the data, our model did not improve much; in fact, it performed worse.

Accomplishments that we're proud of

We have two main achievements to be proud of. First, we achieved a respectable prediction result without investing much time in tuning the model. Second, we successfully identified Vehicle Category and Fuel Technology as the most influential predictors, providing us with valuable insights.

What we learned

The project improved our ability to process data and showed us which features and requirements we need to consider when applying a model. We also deepened our expertise in feature selection and hyperparameter tuning. Furthermore, analyzing vehicle trends helped us understand how various factors influence retention rates and future fuel demand.

What's next for Rice Datathon-Chevron

We aim to enhance our model by exploring deep-learning techniques for improved accuracy. Additionally, we hope to collaborate with Chevron's research team to uncover valuable patterns that facilitate strategic planning.

Built With

excel
jupyter
python

Updates

Huang Zi Yi started this project — Feb 02, 2025 06:01 AM EST
