Inspiration

We were drawn to this dataset by its spatial elements. Well performance was closely correlated with a well's physical location, but the data offered few spatial attributes beyond the locations of the wells themselves. This made for a challenging and interesting problem.

What it does

Our app lets users upload a dataset of wells and download our model's predictions as a .csv file.

How we built it

We built and trained our model in Python using scikit-learn, then created our app using Streamlit.
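The upload-and-predict flow can be sketched without either library. This is a minimal, standard-library illustration, not the app's actual code: the function name `predictions_csv`, the column names, and the stand-in `MeanOfFeaturesModel` are all hypothetical. Any object with a scikit-learn style `predict` method could take the stand-in's place, and in a Streamlit app the input text would plausibly come from `st.file_uploader` and the output be offered through `st.download_button`.

```python
import csv
import io


class MeanOfFeaturesModel:
    """Hypothetical stand-in for a fitted estimator (e.g. one trained with scikit-learn)."""

    def predict(self, X):
        # Returns the mean of each row's features; a real model would do far more.
        return [sum(row) / len(row) for row in X]


def predictions_csv(uploaded_text, model, feature_columns):
    """Read well rows from CSV text, predict, and return CSV text with a 'prediction' column."""
    rows = list(csv.DictReader(io.StringIO(uploaded_text)))
    if not rows:
        return ""

    # Build the numeric feature matrix in a fixed column order.
    X = [[float(row[col]) for col in feature_columns] for row in rows]
    predictions = model.predict(X)

    out = io.StringIO()
    fieldnames = list(rows[0].keys()) + ["prediction"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row, pred in zip(rows, predictions):
        writer.writerow({**row, "prediction": float(pred)})
    return out.getvalue()
```

Keeping the CSV handling in a plain function like this would make it testable separately from the web interface.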

Challenges we ran into

Feature engineering proved difficult on this project. We spent considerable time trying to build spatial and physical features that were stronger predictors of oil production than the features originally provided. We also had to separate the categorical features that proved unimportant from the spatial features that mattered, and to learn the terminology of oil production, a field none of us was familiar with.
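One common kind of spatial feature is the average production of a well's nearest neighbours. The sketch below is an illustration under assumptions, not the feature set we ended up with: the keys `lat`, `lon`, and `production` and both function names are hypothetical, and it assumes each well comes with latitude and longitude in degrees.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def neighbor_mean_production(wells, k=3):
    """For each well, average the production of its k nearest *other* wells.

    Returns None for a well that has no neighbours at all.
    """
    features = []
    for i, well in enumerate(wells):
        # Pair each other well's distance with its production; skip the well itself.
        candidates = [
            (haversine_km(well["lat"], well["lon"], other["lat"], other["lon"]),
             other["production"])
            for j, other in enumerate(wells)
            if j != i
        ]
        candidates.sort(key=lambda pair: pair[0])
        nearest = candidates[:k]
        if nearest:
            features.append(sum(p for _, p in nearest) / len(nearest))
        else:
            features.append(None)
    return features
```

Each well is excluded from its own neighbourhood so the feature does not leak that well's target. For wells with unknown production, the neighbours would have to be drawn from the training wells only. The pairwise loop is quadratic in the number of wells, which is fine for a sketch but would need a spatial index on a large dataset.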

Accomplishments that we're proud of

Despite the unfamiliar domain and messy dataset, we generated interesting visualizations and tried some creative approaches to feature engineering, and our final model reflects that work.

What we learned

We did a little reading on the process of oil production and learned about feature engineering for geospatial data.

What's next for Well, Well, Well

Our model currently makes little use of the many categorical variables provided. Finding a way to transform these into meaningful features could substantially improve our tree-based models.
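One candidate transformation is smoothed target (mean) encoding, which replaces each category with a numeric value that tree-based models can split on. The sketch below is only an illustration of that technique, not something we have tried: the function names and the `smoothing` default are hypothetical choices.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Learn a smoothed mean of the target for each category.

    Rare categories are pulled toward the global mean; `smoothing` acts like
    a number of pseudo-observations at the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    encoding = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return encoding, global_mean


def apply_target_encoding(categories, encoding, global_mean):
    """Replace categories with learned values; unseen categories get the global mean."""
    return [encoding.get(cat, global_mean) for cat in categories]
```

The encoding should be fitted on training rows only (ideally within cross-validation folds), since fitting it on the same rows it is applied to leaks the target into the features.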
