Inspiration
We were drawn to this dataset by its spatial elements. Well performance correlated closely with a well's physical location, but the data offered few spatial attributes beyond the wells themselves. This made for a challenging and interesting problem.
What it does
Our app lets users upload a dataset of wells and receive our model's predictions as a .csv file.
How we built it
We built and trained our model in Python using scikit-learn, then created our app using streamlit.
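A minimal sketch of this kind of workflow, for illustration only: it trains a tree-based scikit-learn regressor on synthetic data and formats predictions as CSV text. The synthetic data, the column names, the helper `predictions_to_csv`, and the choice of `RandomForestRegressor` are all assumptions made for the example, not a description of the project's actual model.

```python
import csv
import io

import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: two hypothetical features (x, y well coordinates).
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(200, 2))
# Made-up target: production rises with x and falls with y, plus noise.
y_train = 50 + 5 * X_train[:, 0] - 3 * X_train[:, 1] + rng.normal(0, 1, size=200)

# Assumed model choice; any scikit-learn regressor would fit this pattern.
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)


def predictions_to_csv(model, X, well_ids):
    """Return predictions as CSV text with one row per well (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["well_id", "predicted_production"])
    for well_id, pred in zip(well_ids, model.predict(X)):
        writer.writerow([well_id, f"{pred:.2f}"])
    return buffer.getvalue()


X_new = np.array([[1.0, 1.0], [9.0, 2.0]])
csv_text = predictions_to_csv(model, X_new, ["W1", "W2"])
print(csv_text)
```

In a Streamlit app, text like `csv_text` could be handed to `st.download_button` so that users can save the predictions as a file.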
Challenges we ran into
Feature engineering proved difficult on this project. We spent considerable time trying to build spatial or physical features that were stronger predictors of oil production than the features originally provided. We also had to separate the categorical features that appeared unimportant from the important spatial ones, and to learn the terminology of oil production, a field none of us was familiar with.
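One common kind of spatial feature, sketched here purely as an example, is the mean production of a well's nearest neighbors. The function name `neighbor_mean_feature`, the `(x, y, production)` layout, and the sample values are hypothetical, and this is not necessarily one of the features we tried.

```python
import math


def neighbor_mean_feature(wells, k=3):
    """For each well, return the mean production of its k nearest other wells.

    `wells` is a list of (x, y, production) tuples (a hypothetical layout).
    """
    features = []
    for i, (xi, yi, _) in enumerate(wells):
        # Distance from well i to every other well, paired with that well's production.
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), prod)
            for j, (xj, yj, prod) in enumerate(wells)
            if j != i
        )
        nearest = [prod for _, prod in dists[:k]]
        # With a single well there are no neighbors to average.
        features.append(sum(nearest) / len(nearest) if nearest else float("nan"))
    return features


# Made-up wells forming two clusters: (x, y, production).
wells = [
    (0.0, 0.0, 100.0),
    (1.0, 0.0, 110.0),
    (0.0, 1.0, 90.0),
    (10.0, 10.0, 20.0),
    (11.0, 10.0, 30.0),
    (10.0, 11.0, 25.0),
]
features = neighbor_mean_feature(wells, k=2)
print(features)
```

Because this feature is computed from the target itself, it would need to be built from training wells only when evaluating a model, otherwise the validation score would be inflated by leakage.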
Accomplishments that we're proud of
Despite the unfamiliar domain and messy dataset, we managed to generate interesting visualizations and tried some creative approaches to feature engineering. Our final model reflects that work.
What we learned
We did a little reading on the process of oil production and learned about feature engineering for geospatial data.
What's next for Well, Well, Well
Our model makes little use of the many categorical variables provided. Finding a way to transform them into meaningful features could substantially improve our tree-based models.
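One option we could explore is smoothed target encoding, which replaces each category with a blend of that category's mean production and the overall mean. The sketch below is only an illustration under assumed names: `target_encode`, the `operators` column, and the sample values are hypothetical, and we have not tested this approach on the dataset.

```python
from collections import defaultdict


def target_encode(categories, targets, smoothing=5.0):
    """Map each category to a smoothed mean of the target (hypothetical helper)."""
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Blend each category mean with the global mean; rare categories
    # are pulled more strongly toward the global mean.
    return {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }


# Made-up categorical column and production values.
operators = ["A", "A", "A", "B", "C", "C"]
production = [100.0, 120.0, 110.0, 40.0, 60.0, 80.0]

encoding = target_encode(operators, production, smoothing=2.0)
encoded_column = [encoding[op] for op in operators]
print(encoding)
```

As with any target-derived feature, the encoding should be fit on training data only and then applied to validation or test rows.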
Built With
- python
- scikit-learn
- streamlit