What it does

Our project uses a random forest model to predict the peak oil rate of wells.

How we built it

We experimented with multiple linear regression (MLR), Lasso regression, and random forest models using scikit-learn to generate our predictions, each with varying success. Ultimately, the random forest model gave the best results.
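A comparison like this can be sketched with scikit-learn's cross-validation helpers. This is only a minimal illustration under stated assumptions: the data is synthetic (generated with `make_regression`), and the hyperparameters, the 5-fold split, and the R^2 metric are placeholders rather than the settings we used. Because the synthetic target is linear, the ranking it prints says nothing about how the models ranked on the real well data.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the well data (assumption: the real dataset is not shown here).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# The three model families mentioned above, with placeholder hyperparameters.
models = {
    "MLR": LinearRegression(),
    "Lasso": Lasso(alpha=1.0),
    "Random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Score each model with 5-fold cross-validated R^2 and keep the mean per model.
scores = {}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
    scores[name] = r2.mean()
    print(f"{name}: mean R^2 = {scores[name]:.3f}")
```

Scoring every model on the same folds keeps the comparison fair, since each one sees identical train and validation splits.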

Challenges we ran into

Feature selection, collinearity among the predictor variables, and data cleaning were our main challenges. Under the time constraint of the Datathon, the random forest handled these problems best of the models we tried.
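One common way to spot collinear predictors is to scan the pairwise correlation matrix for strongly correlated columns. The sketch below is an illustration only, not necessarily the check we ran: the feature names (`lateral_length`, `proppant_mass`, `porosity`), the synthetic data, and the 0.9 threshold are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Hypothetical features: proppant_mass is built to track lateral_length closely,
# while porosity is generated independently.
lateral_length = rng.normal(size=n)
proppant_mass = 0.95 * lateral_length + 0.05 * rng.normal(size=n)
porosity = rng.normal(size=n)

names = ["lateral_length", "proppant_mass", "porosity"]
X = np.column_stack([lateral_length, proppant_mass, porosity])

# Pairwise Pearson correlations between columns (features).
corr = np.corrcoef(X, rowvar=False)

# Flag feature pairs whose absolute correlation exceeds an illustrative threshold.
threshold = 0.9
collinear_pairs = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > threshold
]
print(collinear_pairs)
```

Flagged pairs are candidates for dropping or combining before fitting a linear model. Collinearity mainly destabilizes linear coefficients, while a random forest's predictions are less sensitive to it.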

What we learned

We learned a great deal about machine learning strategies and regression models, including the strengths and weaknesses of each and the situations each one suits. This gave us a good handle on how to apply them to predicting the peak oil rate of these wells.
