The Data Bandits chose to work with GE's large data set for data analysis and visualization. We used SQLite to query the data and Tableau to illustrate the results of our data exploration. We also built a rudimentary analysis in Python using scikit-learn, with the goal of modeling the time series and detecting anomalies, but to no avail. Instead of looking at statistical anomalies, we looked at data anomalies. Our goal was to create some visuals, test various hypotheses, and dabble in some forecasting, but because the data sets were ambiguous, it was difficult for us to define basic characteristics, such as what counts as a failure. That was arguably the most time-consuming part, along with handling the sheer volume of data on laptops with limited memory.
Instead of running the analyses locally, we should have used cloud services to process the data.
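
Short of moving to the cloud, one way to ease the memory pressure on a laptop is to stream query results out of SQLite in batches rather than loading a whole table at once. The following is only a minimal sketch of that idea, not what we actually ran: the table name `readings`, its columns `sensor_id` and `value`, and the function `streaming_means` are all hypothetical and do not come from the GE data set.

```python
import sqlite3
from collections import defaultdict


def streaming_means(conn, batch_size=10_000):
    """Compute a per-sensor mean while holding only one batch of rows in memory.

    Assumes a hypothetical table `readings(sensor_id, value)`.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    cur = conn.execute("SELECT sensor_id, value FROM readings")
    while True:
        rows = cur.fetchmany(batch_size)  # pull the next batch from the cursor
        if not rows:
            break
        for sensor_id, value in rows:
            if value is None:  # skip missing readings
                continue
            totals[sensor_id] += value
            counts[sensor_id] += 1
    return {sensor: totals[sensor] / counts[sensor] for sensor in counts}


# Tiny in-memory demo with made-up rows (not real GE data).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE readings (sensor_id TEXT, value REAL)")
demo.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0), ("b", None)],
)
means = streaming_means(demo, batch_size=2)
print(means)
```

The same pattern extends to other running aggregates (counts, minima, maxima), so summary tables small enough for Tableau or scikit-learn can be built without the full data set ever sitting in memory.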