We were inspired by the recent extreme weather events that have severely affected the Houston community. That made a data science pipeline for visualizing weather data feel like a problem worth solving.
What it does
Our code pulls the raw data files from web endpoints and wrangles each one into a pandas DataFrame. It then integrates the three data streams into a single time series plot that accounts for the geospatial locations of the meteorological sites.
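As a rough illustration of the load-and-integrate step, here is a minimal sketch. It is not our actual code: the stream names (`wind_speed`, `wind_direction`, `temperature`), the `timestamp`/`value` column layout, and the helpers `load_stream` and `integrate` are all hypothetical, and inline CSV text stands in for the web endpoints. `pd.read_csv` also accepts a URL, so the same function could be pointed at an endpoint.

```python
import io

import pandas as pd


def load_stream(source, value_name):
    """Read one CSV stream into a DataFrame indexed by timestamp.

    `source` may be a URL, a file path, or a file-like object.
    The `timestamp` and `value` column names are assumptions.
    """
    df = pd.read_csv(source, parse_dates=["timestamp"])
    df = df.rename(columns={"value": value_name})
    return df.set_index("timestamp").sort_index()


def integrate(streams):
    """Outer-join the streams on timestamp so gaps show up as NaN."""
    return pd.concat(streams, axis=1, join="outer").sort_index()


# Inline sample data stands in for the web endpoints (made-up values).
speed_csv = "timestamp,value\n2022-01-01 00:00,3.2\n2022-01-01 01:00,4.1\n"
direction_csv = "timestamp,value\n2022-01-01 00:00,180\n2022-01-01 01:00,195\n"
temperature_csv = "timestamp,value\n2022-01-01 01:00,12.5\n2022-01-01 02:00,11.9\n"

combined = integrate(
    [
        load_stream(io.StringIO(speed_csv), "wind_speed"),
        load_stream(io.StringIO(direction_csv), "wind_direction"),
        load_stream(io.StringIO(temperature_csv), "temperature"),
    ]
)
print(combined)
```

The outer join keeps every timestamp that appears in any stream, which makes missing readings visible instead of silently dropping rows.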
How we built it
We built our pipeline using pandas, NumPy, matplotlib, and Basemap. For data wrangling and pre-processing we mostly wrote our own custom functions, and for time alignment we wrote our own algorithms.
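For readers who want to see what time alignment can look like, here is a small sketch that uses pandas' built-in `pd.merge_asof` rather than our own algorithm. The column names, the sample readings, and the 5-minute tolerance are assumptions chosen for illustration only.

```python
import pandas as pd

# Two hypothetical streams sampled at slightly different times (made-up values).
wind = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2022-01-01 00:00", "2022-01-01 00:10", "2022-01-01 00:20"]
        ),
        "wind_speed": [3.2, 4.1, 3.8],
    }
)
temperature = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2022-01-01 00:01", "2022-01-01 00:12", "2022-01-01 00:40"]
        ),
        "temperature": [20.0, 20.5, 21.0],
    }
)

# Match each wind reading to the nearest temperature reading in time,
# but only if the two are at most 5 minutes apart. Both frames must be
# sorted by the key column, which they are here.
aligned = pd.merge_asof(
    wind,
    temperature,
    on="timestamp",
    direction="nearest",
    tolerance=pd.Timedelta("5min"),
)
print(aligned)
```

The last wind reading has no temperature reading within the tolerance, so its `temperature` value is left as NaN rather than being paired with a stale measurement.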
Challenges we ran into
One of the main difficulties was drawing a dynamic vector field on top of the static map. Redrawing the map for every frame was too expensive, so we drew the map once, drew the vectors for each time step, and then removed them before drawing the vectors for the next time step in the series.
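The draw-then-remove approach described above can be sketched roughly as follows. This is an illustration under assumptions, not our exact code: a plain matplotlib Axes stands in for the Basemap-backed map, and the site coordinates and wind components are made up.

```python
import matplotlib

matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical site coordinates (longitude, latitude) around Houston.
lons = np.array([-95.4, -95.1, -95.8])
lats = np.array([29.8, 29.6, 30.0])

# Hypothetical wind components, one row per time step and one column per site.
u = np.array([[1.0, 0.5, -0.3], [0.8, 0.7, -0.1], [0.2, 1.0, 0.4]])
v = np.array([[0.2, -0.4, 0.9], [0.3, -0.2, 0.7], [0.6, 0.1, 0.5]])

fig, ax = plt.subplots()
ax.set_xlim(-96.0, -94.5)
ax.set_ylim(29.0, 30.5)

# The expensive static background is drawn exactly once.
# (Stand-in for the map; here it is just the site markers.)
ax.plot(lons, lats, "k.")

quiver = None
for step in range(u.shape[0]):
    # Remove the previous time step's vectors; the background stays untouched.
    if quiver is not None:
        quiver.remove()
    quiver = ax.quiver(lons, lats, u[step], v[step])
    ax.set_title(f"time step {step}")
    fig.canvas.draw()
```

Only one set of vectors is ever attached to the axes at a time. Another option in matplotlib is to keep a single quiver object and update it in place with `Quiver.set_UVC`, which avoids creating a new artist per frame.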
Accomplishments that we're proud of
Pulling an all-nighter and committing to learning more about common packages and development tools in data science.
What we learned
We learned how to sift through the documentation for Python packages like matplotlib and how to test small feature changes in our code.
What's next for Cognite Rice Datathon 2022 Challenge
We hope to continue this project after the datathon and add customization so that the data can be sliced to any time period the user is interested in.