Inspiration

We were inspired by recent extreme weather events that have severely affected the Houston community. For this reason, we felt that a data science pipeline for visualizing weather data would address an impactful problem.

What it does

Our code pulls the raw data files from web endpoints, wrangles them into pandas DataFrames, and integrates the three data streams into a cohesive time series plot that accounts for the geospatial properties of the meteorological sites.
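As a rough illustration of the wrangling step, here is a minimal sketch of parsing three CSV payloads into DataFrames and merging them with site coordinates. The stream names (wind, temperature, humidity), the column names, and the inline CSV text standing in for the endpoint responses are all assumptions made for this example, not our actual data layout.

```python
import io
from functools import reduce

import pandas as pd

# Hypothetical CSV payloads standing in for the text each web endpoint returns.
WIND_CSV = "timestamp,site,wind_speed\n2022-01-01 00:00,A,3.0\n2022-01-01 01:00,A,4.0\n"
TEMP_CSV = "timestamp,site,temp_c\n2022-01-01 00:00,A,12.5\n2022-01-01 01:00,A,13.0\n"
HUMID_CSV = "timestamp,site,humidity\n2022-01-01 00:00,A,80\n2022-01-01 01:00,A,75\n"

# Hypothetical site coordinates (latitude, longitude) used for the map.
SITES = pd.DataFrame({"site": ["A"], "lat": [29.76], "lon": [-95.37]})


def load_stream(text):
    """Parse one CSV payload into a DataFrame with real timestamps."""
    return pd.read_csv(io.StringIO(text), parse_dates=["timestamp"])


def combine_streams(payloads, sites):
    """Outer-join all streams on (timestamp, site), then attach coordinates."""
    frames = [load_stream(text) for text in payloads]
    merged = reduce(
        lambda left, right: left.merge(right, on=["timestamp", "site"], how="outer"),
        frames,
    )
    merged = merged.merge(sites, on="site", how="left")
    return merged.sort_values(["timestamp", "site"]).reset_index(drop=True)


combined = combine_streams([WIND_CSV, TEMP_CSV, HUMID_CSV], SITES)
```

In a real run, the inline strings would be replaced by the text downloaded from each endpoint.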

How we built it

We built our pipeline with pandas, NumPy, matplotlib, and Basemap. For data wrangling and pre-processing we mostly wrote our own custom functions, and for time alignment we wrote our own algorithms.
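To give a sense of what time alignment involves, here is a small sketch that uses the built-in `pandas.merge_asof` rather than our custom algorithm. The column names, the sample readings, and the 10-minute tolerance are assumptions chosen for illustration.

```python
import pandas as pd


def align_nearest(left, right, tolerance="10min"):
    """Attach to each left row the right row with the nearest timestamp.

    Rows with no match within the tolerance get NaN in the right-hand columns.
    """
    left = left.sort_values("timestamp")
    right = right.sort_values("timestamp")
    return pd.merge_asof(
        left,
        right,
        on="timestamp",
        direction="nearest",
        tolerance=pd.Timedelta(tolerance),
    )


# Hypothetical readings sampled on slightly different clocks.
wind = pd.DataFrame({
    "timestamp": pd.to_datetime(["2022-01-01 00:00", "2022-01-01 01:00"]),
    "wind_speed": [3.0, 4.0],
})
temp = pd.DataFrame({
    "timestamp": pd.to_datetime(["2022-01-01 00:04", "2022-01-01 01:30"]),
    "temp_c": [12.5, 13.0],
})

aligned = align_nearest(wind, temp)
```

Here the first wind reading picks up the temperature taken four minutes later, while the second has no temperature within ten minutes and is left as missing.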

Challenges we ran into

One of the main difficulties we ran into was drawing a dynamic vector field on top of the static map. Redrawing the map for every frame was too expensive, so we drew the map once, drew the vectors for each time step on top of it, and then removed those vectors before drawing the vectors for the next time step in the series.
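The draw-then-remove pattern can be sketched with plain matplotlib as follows. This is a simplified stand-in, not our actual code: a single line plot takes the place of the Basemap background, and the site coordinates, vector components, and the `animate_vectors` helper are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def animate_vectors(ax, lons, lats, u_frames, v_frames, on_frame=None):
    """Draw one quiver per time step on an already-drawn background.

    Each quiver is removed before the next one is drawn, so the expensive
    background is rendered only once. Returns the number of frames drawn.
    """
    drawn = 0
    for u, v in zip(u_frames, v_frames):
        arrows = ax.quiver(lons, lats, u, v)
        if on_frame is not None:
            on_frame(ax)  # e.g. save the figure or pause for display
        arrows.remove()   # clear only the vectors, keep the background
        drawn += 1
    return drawn


fig, ax = plt.subplots()
# Stand-in for the static map: drawn once, never redrawn.
ax.plot([-95.8, -95.0], [29.5, 30.1], color="gray")

# Hypothetical site coordinates and per-time-step wind components.
lons = np.array([-95.4, -95.2])
lats = np.array([29.7, 29.9])
u_frames = [np.array([1.0, 0.5]), np.array([0.0, -1.0]), np.array([-0.5, 0.2])]
v_frames = [np.array([0.0, 0.5]), np.array([1.0, 0.0]), np.array([0.5, -0.2])]

frames_drawn = animate_vectors(ax, lons, lats, u_frames, v_frames)
```

An alternative worth exploring is to create the quiver once and update its components with `Quiver.set_UVC` for each time step, which avoids creating and destroying an artist per frame.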

Accomplishments that we're proud of

We are proud of pulling an all-nighter and committing to learning more about common packages and development tools in data science.

What we learned

We learned how to sift through the documentation for Python packages such as matplotlib and how to test small feature changes in our code.

What's next for Cognite Rice Datathon 2022 Challenge

We hope to continue this project after the datathon and add customization so that the data can be sliced into any time period the user is interested in.
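One possible shape for this feature, sketched under the assumption that the combined data is indexed by timestamp, is label-based slicing on a pandas `DatetimeIndex`. The `slice_period` helper and the sample readings are hypothetical.

```python
import pandas as pd


def slice_period(frame, start, end):
    """Return the rows whose timestamp index falls within [start, end]."""
    frame = frame.sort_index()  # label slicing expects a sorted index
    return frame.loc[pd.Timestamp(start):pd.Timestamp(end)]


# Hypothetical hourly readings indexed by timestamp.
index = pd.date_range("2022-01-01", periods=6, freq=pd.Timedelta(hours=1))
readings = pd.DataFrame({"wind_speed": [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]}, index=index)

window = slice_period(readings, "2022-01-01 01:00", "2022-01-01 03:00")
```

Both endpoints are included, so this window keeps the readings at 01:00, 02:00, and 03:00.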
