datathon

Data Visualization: We found 54 outliers with rates of penetration above 300 RMPs. We removed these outliers from our training data set because they distort the patterns in the remaining data, which have lower rates of penetration. See datathon.m for the correlation matrix and scatter plots of each feature against rate of penetration.
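
The outlier step described above can be sketched in a few lines of Python. This is only an illustration, not the code in datathon.m: the sample rows and the feature name (weight on bit) are hypothetical, and only the cutoff of 300 comes from the write-up. It splits rows at the cutoff and compares the Pearson correlation between one feature and rate of penetration with and without the outliers.

```python
import math

ROP_CUTOFF = 300.0  # cutoff taken from the write-up

# Hypothetical sample rows: (weight_on_bit, rate_of_penetration).
rows = [
    (10.0, 42.0),
    (12.0, 55.0),
    (15.0, 61.0),
    (11.0, 350.0),  # above the cutoff, so treated as an outlier
    (18.0, 80.0),
]


def split_outliers(rows, cutoff=ROP_CUTOFF):
    """Return (kept, outliers), splitting on rate of penetration."""
    kept = [r for r in rows if r[1] <= cutoff]
    outliers = [r for r in rows if r[1] > cutoff]
    return kept, outliers


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


kept, outliers = split_outliers(rows)
r_kept = pearson([r[0] for r in kept], [r[1] for r in kept])
r_all = pearson([r[0] for r in rows], [r[1] for r in rows])
print(len(outliers), round(r_kept, 3), round(r_all, 3))
```

On this made-up sample, a single extreme row is enough to change the correlation noticeably, which is the kind of distortion that motivates removing such rows before training.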

Modeling: We used linear regression, informed by our outlier analysis and two newly engineered features, to predict the rate of penetration for every data point (see datadriven3.m).
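
As a rough sketch of this kind of model, and not the implementation in datadriven3.m, the Python below fits ordinary least squares on two raw features plus two engineered ones. Everything specific here is an assumption made for illustration: the feature names (wob, rpm), the choice of engineered features (a product and a square), the synthetic data, and the reading that the outlier knowledge means excluding rows above the cutoff from training.

```python
ROP_CUTOFF = 300.0  # cutoff taken from the write-up


def engineer(wob, rpm):
    """Raw features plus two hypothetical engineered ones."""
    return [wob, rpm, wob * rpm, wob ** 2]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(X, y):
    """Least squares via the normal equations; coef[0] is the intercept."""
    A = [[1.0] + list(row) for row in X]
    k = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(k)]
    return solve(ata, aty)


def predict(coef, features):
    return coef[0] + sum(c * f for c, f in zip(coef[1:], features))


def synthetic_rop(wob, rpm):
    """Made-up ground truth used only to generate example data."""
    return 5 + 2 * wob + 3 * rpm + 0.5 * wob * rpm + wob ** 2


raw = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5), (2, 5), (4, 1), (3, 2)]
samples = [(float(w), float(r), synthetic_rop(w, r)) for w, r in raw]
samples.append((1.0, 1.0, 350.0))  # hypothetical outlier row

# Train only on rows at or below the cutoff.
train = [s for s in samples if s[2] <= ROP_CUTOFF]
coef = fit_linear([engineer(w, r) for w, r, _ in train],
                  [rop for _, _, rop in train])

# Predict for every data point, including the excluded outlier row.
predictions = [predict(coef, engineer(w, r)) for w, r, _ in samples]
print([round(c, 3) for c in coef])
```

The engineered columns keep the model linear in its coefficients, so plain linear regression can still capture an interaction or curvature in the raw features.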

Built With

jupyter-notebook
matlab

Updates

Julia Coyner started this project — Jan 25, 2020 03:23 PM EST
