What it does
Our project tackles a regression problem: predicting the Rate of Penetration (ROP) from the other parameters in the dataset.
How we built it
We built the project in Python with several of its libraries and used GlueViz for the visualizations. Before modeling, we cleaned the data by mapping strings to unique numbers and removing outliers from the training data. We also engineered new features based on oil and gas formulas we found through our research. The models we used were:
- Neural Networks
- Gaussian Fitting
- LASSO Regression
- Ridge Regression
- Elastic Net
- Random Forests
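The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: the column values, the helper names `encode_categories` and `iqr_mask`, and the choice of the 1.5 x IQR rule for outliers are all assumptions made for the example.

```python
from statistics import quantiles


def encode_categories(values):
    """Map each distinct string to a unique integer, in order of first appearance."""
    codes = {}
    encoded = []
    for v in values:
        if v not in codes:
            codes[v] = len(codes)
        encoded.append(codes[v])
    return encoded, codes


def iqr_mask(values, k=1.5):
    """Return True for each value inside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [low <= v <= high for v in values]


# Hypothetical training rows: a string column and a numeric target.
formations = ["shale", "sandstone", "shale", "limestone", "sandstone", "shale"]
target = [20.0, 22.0, 21.0, 23.0, 19.0, 400.0]  # the last value is an obvious outlier

encoded, mapping = encode_categories(formations)
keep = iqr_mask(target)

# Keep only the training rows whose target passed the outlier check.
clean_rows = [(c, t) for c, t, ok in zip(encoded, target, keep) if ok]
```

Integer codes like these impose an arbitrary ordering on the categories, which tree-based models tolerate better than linear ones; one-hot encoding is a common alternative for the regression models.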
Challenges we ran into
Our main challenge was reducing the loss by tuning model parameters. Our best models reached an RMSE of 15-20, but getting there took extensive analysis of the variables and repeated parameter tuning.
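To make the metric and the tuning loop concrete, here is a small sketch of how RMSE can be computed and how one parameter can be swept against a held-out validation split. Everything in it is assumed for illustration: the synthetic data, the single-feature ridge fit `fit_ridge_1d`, and the list of `alpha` values are not taken from our project.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_ridge_1d(x, y, alpha):
    """Closed-form ridge slope for one feature, no intercept: sum(xy) / (sum(x^2) + alpha)."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + alpha)


# Synthetic data standing in for one feature and the target (assumption).
random.seed(0)
x = [random.uniform(-1, 1) for _ in range(200)]
y = [3.0 * a + random.gauss(0, 0.5) for a in x]

# Hold out the last 50 rows for validation.
x_train, x_val = x[:150], x[150:]
y_train, y_val = y[:150], y[150:]

# Sweep the regularization strength and record the validation RMSE for each value.
results = {}
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    w = fit_ridge_1d(x_train, y_train, alpha)
    results[alpha] = rmse(y_val, [w * a for a in x_val])

best_alpha = min(results, key=results.get)
```

Scoring on held-out rows rather than the training rows matters here: training RMSE alone tends to reward the least regularized model, which says little about how the model generalizes.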
Accomplishments that we're proud of
We tested many different models to find the best one. Overall, we are proud of the variety of approaches we tried and the final results we achieved.
What we learned
We learned the modeling process, the data science pipeline, and a range of models, including XGBoost, neural networks, and regression.
What's next for Chevron Challenge
Given more features and data, we could optimize our model further to achieve a lower RMSE. Furthermore, if we had a better understanding of the domain, we could build a better model, since there were certain variables in the ROP model that we did not know how to calculate.