Every month, 20 workers are hospitalized because of maintenance-related accidents. This leads to a loss of time, energy, and capital. This statistic inspired us to tackle the problem, because we want our workers to be safe and healthy.
What it does
Predicts equipment failure from sensor data and determines which sensors should be used to detect the failure.
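One way to see which sensors matter is to read the impurity-based feature importances from a fitted tree. This is a minimal sketch, not our exact pipeline: the data is synthetic (generated with `make_classification`) and the `sensor_i` names are hypothetical placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for sensor readings (assumption: not the project's data).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0,
    random_state=0,
)
sensor_names = [f"sensor_{i}" for i in range(X.shape[1])]  # hypothetical names

model = DecisionTreeClassifier(random_state=0).fit(X, y)

# Rank sensors by impurity-based importance, most useful first.
ranking = sorted(
    zip(sensor_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Sensors with an importance near zero are candidates to drop, although impurity-based scores can be biased, so they are best treated as a first pass.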
How I built it
We used the Sklearn library and tried many different models. The Decision Tree Classifier, when trained on 80% of the training data, achieved an accuracy of over 98%. The tree performed better than an SVM before any tuning. The Decision Tree algorithm trains only on the empirical error and does not try to optimize the margin between the two classes. To take the margin into account, we implemented a Decision Tree Regressor. In theory, the regression tree provides control over the generalization error. In practice, we found no difference in error between the Classifier and the Regressor. Therefore, we chose the Decision Tree Classifier model.
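The comparison described above could be sketched roughly as follows. The data here is synthetic and imbalanced to loosely mimic rare failures, the 0.5 threshold on the regressor output is our assumption for turning predictions into labels, and the scores it produces are not the results we reported.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Synthetic, imbalanced stand-in for the sensor data (assumption).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Hold out 20% for evaluation and train on the remaining 80%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, stratify=y, random_state=0
)

scores = {}

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
scores["tree_classifier"] = accuracy_score(y_test, clf.predict(X_test))

# SVM with default settings, i.e. before any tuning.
svm = SVC().fit(X_train, y_train)
scores["svm_default"] = accuracy_score(y_test, svm.predict(X_test))

# Regression tree fit on the 0/1 labels; threshold at 0.5 to get classes.
reg = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
reg_labels = (reg.predict(X_test) >= 0.5).astype(int)
scores["tree_regressor"] = accuracy_score(y_test, reg_labels)

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

With rare failures, accuracy alone can look high even for a weak model, so recall on the failure class is worth checking alongside it.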
Challenges I ran into
Computational limitations on the grid search, and a lack of failure data.
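To illustrate why grid search gets expensive, here is a small sketch with a deliberately tiny grid. The parameter values and the synthetic data are assumptions chosen for illustration, not the grid we actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# Hypothetical grid: 3 depths x 2 leaf sizes = 6 combinations.
param_grid = {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

# Cost grows as (number of combinations) x (number of folds).
n_fits = len(search.cv_results_["params"]) * search.n_splits_
print("model fits during search:", n_fits)
print("best parameters:", search.best_params_)
```

Every extra value in the grid multiplies the number of fits, which is how a modest-looking grid can exceed the available compute.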
Accomplishments that I'm proud of
Our approach to the problem, and using technology to solve real-world problems.
What I learned
Along the way, we discovered that Random Forest Regression performed best at predicting equipment failure. We also discovered that an XGBoost classifier did not improve on the Random Forest Regression results. The team was also able to use principal component analysis (PCA) to reduce the number of features used in the data without a significant loss in accuracy.
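The PCA step can be sketched as a pipeline that is compared against the same model on the full feature set. This is an illustration under assumptions: the data is synthetic, the 95% explained-variance target and the 0.5 threshold on the regressor output are choices made for the example, and the numbers it prints are not our results.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with many redundant features (assumption).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, n_redundant=10,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0
)


def fit_and_score(steps):
    """Fit preprocessing steps plus a random forest; return model and accuracy."""
    model = make_pipeline(
        *steps, RandomForestRegressor(n_estimators=50, random_state=0)
    )
    model.fit(X_train, y_train)
    labels = (model.predict(X_test) >= 0.5).astype(int)  # threshold is an assumption
    return model, accuracy_score(y_test, labels)


full_model, full_acc = fit_and_score([StandardScaler()])

# Keep just enough components to explain 95% of the variance.
pca_model, pca_acc = fit_and_score([StandardScaler(), PCA(n_components=0.95)])
n_kept = pca_model.named_steps["pca"].n_components_

print(f"all {X.shape[1]} features: accuracy {full_acc:.3f}")
print(f"{n_kept} PCA components: accuracy {pca_acc:.3f}")
```

Scaling before PCA matters because sensors measured in larger units would otherwise dominate the components.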
What's next for Predicting Failure
A possible next step is to perform statistical analysis on the PCA results to determine the statistical significance of the different component weights, and more broadly to use statistical analysis to validate our assumptions.