OSIsoft's data inspired us to dig as deep as we could into their PI system. It turns out that, after removing all the noise and errors, the data holds a world of answers to questions like, "How much energy does a building in a certain condition use, and does the number of people in the building affect that amount?"

What it does

It runs a lot of algorithms and statistical analysis to

  • process and clean up a huge database
  • compute correlations between at least two distinct data sets
  • test whether correlations between data features actually exist
  • display a graph of occupancy in every UC Davis building and its effect on each building's energy use
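The correlation step above can be sketched with pandas. This is a minimal illustration, not our actual code: the DataFrames, column names, and values here are made up, standing in for the real quarter-hour energy and occupancy data sets.

```python
import pandas as pd

# Two illustrative quarter-hour data sets keyed by timestamp
energy = pd.DataFrame({
    "timestamp": pd.date_range("2019-01-01", periods=8, freq="15min"),
    "kwh": [10.0, 12.0, 15.0, 14.0, 20.0, 22.0, 25.0, 24.0],
})
occupancy = pd.DataFrame({
    "timestamp": pd.date_range("2019-01-01", periods=8, freq="15min"),
    "people": [5, 6, 8, 7, 12, 13, 15, 14],
})

# Inner-join on the shared timestamps so the rows line up
merged = energy.merge(occupancy, on="timestamp", how="inner")

# Pearson correlation between occupancy and energy use
r = merged["kwh"].corr(merged["people"])
```

A value of `r` near 1 would support the claim that occupancy drives energy use; a value near 0 would not.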

How I built it

We used Python's pandas library and ran the code in Jupyter notebooks as much as we could; however, since the data set was large, we often switched to the command line.
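For a data set too large to load comfortably in a notebook, pandas can stream a CSV in chunks. A minimal sketch (the file name and column are placeholders, not our real export):

```python
import pandas as pd

# Write a tiny CSV to stand in for the real (much larger) export
pd.DataFrame({"kwh": range(100)}).to_csv("readings.csv", index=False)

# Stream the file in chunks instead of loading it all at once,
# accumulating a running total as each chunk arrives
total = 0.0
for chunk in pd.read_csv("readings.csv", chunksize=25):
    total += chunk["kwh"].sum()
```

Each `chunk` is an ordinary DataFrame, so cleaning and aggregation logic can run piecewise with a fixed memory footprint.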

Challenges I ran into

We ran into many challenges; the data was full of surprises at every turn of our journey. Just as we thought we had a solid statistical inference, errors and missing values in the data set proved us wrong.

Accomplishments that I'm proud of

We are proud of processing three years' worth of data, one quarter-hour at a time, and of producing two distinct data sets whose timestamps match. We are also proud of using a lambda function to format our data so that the indices align, and of applying some statistics to remove outliers from the data.
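The index-alignment and outlier steps might look like this sketch. The timestamp format and the 1.5×IQR rule here are assumptions for illustration, not necessarily the exact choices in our notebooks:

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["01/01/2019 00:00", "01/01/2019 00:15", "01/01/2019 00:30",
                  "01/01/2019 00:45", "01/01/2019 01:00"],
    "kwh": [10.0, 11.0, 10.5, 500.0, 9.5],  # 500.0 is an obvious outlier
})

# Lambda to normalize the timestamp strings into a uniform datetime index,
# so this frame can line up against a second data set
df.index = df["timestamp"].apply(lambda s: pd.to_datetime(s, format="%m/%d/%Y %H:%M"))

# IQR rule: drop rows more than 1.5 * IQR outside the quartiles
q1, q3 = df["kwh"].quantile([0.25, 0.75])
iqr = q3 - q1
clean = df[(df["kwh"] >= q1 - 1.5 * iqr) & (df["kwh"] <= q3 + 1.5 * iqr)]
```

Once both data sets share a datetime index built the same way, their rows can be joined directly by timestamp.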

What I learned

We learned to be very patient.

What's next for Cost by Occupancy and Type

We haven't built much of a front end yet. While it is currently possible to browse the different graphs of the data in a Jupyter notebook, we look forward to adding a front end using Flask.
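A Flask front end might start from something like this minimal sketch. The route and response are placeholders; a fuller version would render templates with the actual graphs:

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # In a fuller version this would render a template embedding the
    # occupancy-vs-energy graphs; here it just confirms the server responds
    return "Cost by Occupancy and Type"

if __name__ == "__main__":
    app.run(debug=True)
```

Running the script serves the page locally, which would let visitors explore the graphs without opening a Jupyter notebook.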
