Rice Datathon 2024 Chevron Track

What it does

Our project predicts the peak oil rate of oil wells using a random forest model.

How we built it

We experimented with multiple linear regression (MLR), Lasso regression, and random forest models in scikit-learn to generate our predictions, with varying success. Ultimately, the random forest model gave the best results.
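
A minimal sketch of how such a three-way comparison could be set up with cross-validation. It runs on synthetic stand-in data, not the Chevron dataset; the helper names (`make_synthetic_wells`, `compare_models`), the feature count, and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_wells(n_wells=300, seed=0):
    """Hypothetical stand-in data; the real well features are not reproduced here."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_wells, 5))
    # Target with an interaction and a nonlinear term, plus noise.
    y = (
        3.0 * X[:, 0]
        + 2.0 * X[:, 1] * X[:, 2]
        + np.sin(X[:, 3])
        + rng.normal(scale=0.3, size=n_wells)
    )
    return X, y


def compare_models(X, y, cv=5):
    """Return the mean cross-validated RMSE for each candidate model."""
    models = {
        "MLR": LinearRegression(),
        # Lasso is sensitive to feature scale, so standardize first.
        "Lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
        "Random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    }
    scores = {}
    for name, model in models.items():
        neg_rmse = cross_val_score(
            model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        scores[name] = -neg_rmse.mean()  # flip sign: lower RMSE is better
    return scores


X, y = make_synthetic_wells()
for name, rmse in compare_models(X, y).items():
    print(f"{name}: RMSE = {rmse:.3f}")
```

Scoring every model on the same folds keeps the comparison fair. The alpha and tree count here are placeholders that would normally be tuned.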

Challenges we ran into

Feature selection, collinearity among the predictor variables, and data cleaning were the main challenges. Ultimately, random forests gave us the best way to handle these problems under the Datathon's time constraint.
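
One common way to diagnose collinearity is the variance inflation factor (VIF). The sketch below is an illustration under assumptions, not a record of what we ran during the Datathon: the feature names (`depth`, `lateral_length`, `proppant`) and the data are hypothetical.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF per column: 1 / (1 - R^2) from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    vifs = np.empty(n_features)
    for j in range(n_features):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n_samples), others])  # intercept + others
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = residuals @ residuals
        ss_tot = ((target - target.mean()) ** 2).sum()
        # 1 / (1 - R^2) simplifies to ss_tot / ss_res.
        vifs[j] = ss_tot / ss_res if ss_res > 0 else np.inf
    return vifs


# Hypothetical features: proppant is built to track lateral_length closely.
rng = np.random.default_rng(0)
depth = rng.normal(size=200)
lateral_length = rng.normal(size=200)
proppant = 0.9 * lateral_length + rng.normal(scale=0.1, size=200)

X = np.column_stack([depth, lateral_length, proppant])
for name, vif in zip(["depth", "lateral_length", "proppant"], variance_inflation_factors(X)):
    print(f"{name}: VIF = {vif:.1f}")
```

A VIF well above roughly 5 to 10 is a common rule-of-thumb signal that a predictor is largely explained by the others. Such predictors are candidates for dropping or combining before fitting a linear model.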

What we learned

We learned a great deal about machine learning strategies and regression models, including the strengths and weaknesses of each and when to use them. We also got a good handle on how to apply them to predict the peak oil rate of these wells.

Built With

dplyr
matplotlib
python
r
scikit-learn
seaborn

Updates

Matthew Cihlar started this project — Jan 21, 2024 08:33 AM EST
