Inspiration

We hear about wildfires destroying acres of land, and about their detrimental impact on the environment, wildlife, and human life, more frequently than ever before. Problems like this must be tackled promptly, and we wanted to contribute to the cause by leveraging historical data. We sought to identify the patterns in the data and the geographic areas most affected by wildfires, so that the findings can be used to design strategies for handling them more effectively.

What it does

This project provides visualizations and maps that show:

  • The most affected states in the United States by wildfires.
  • The years in which the discovered wildfires had the greatest impact.
  • The number of wildfires discovered in each state over the course of a year.
  • The various causes of wildfires and their effects on different regions.
  • The months of the year with the most fires.
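The questions above mostly reduce to grouping and counting. The following is a minimal sketch of those aggregations in pandas, not the project's actual code: the column names (STATE, FIRE_YEAR, MONTH, CAUSE, FIRE_SIZE) and the toy rows are assumptions made for illustration, and "impact" is assumed here to mean total burned area.

```python
import pandas as pd

# Toy records; the column names and values are illustrative assumptions,
# not the project's real dataset.
fires = pd.DataFrame({
    "STATE": ["CA", "CA", "AK", "TX", "CA", "AK"],
    "FIRE_YEAR": [2000, 2000, 2004, 2004, 2010, 2010],
    "MONTH": [7, 8, 6, 7, 7, 6],
    "CAUSE": ["Arson", "Lightning", "Lightning", "Campfire", "Arson", "Lightning"],
    "FIRE_SIZE": [10.0, 2.5, 5000.0, 1.0, 40.0, 900.0],
})

# Most affected states, measured by number of fires.
fires_per_state = fires["STATE"].value_counts()

# Years with the greatest impact, assuming impact means total burned area.
area_per_year = fires.groupby("FIRE_YEAR")["FIRE_SIZE"].sum()

# Number of fires discovered in each state in each year.
fires_per_state_year = fires.groupby(["STATE", "FIRE_YEAR"]).size()

# Causes of fire broken down by state (rows: states, columns: causes).
causes_by_state = pd.crosstab(fires["STATE"], fires["CAUSE"])

# Months with the most fires, in calendar order.
fires_per_month = fires["MONTH"].value_counts().sort_index()
```

Each result is a small Series or DataFrame that can be passed directly to a plotting call, for example `fires_per_month.plot(kind="bar")`.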

How we built it

The project was developed in multiple stages:

  1. Understanding data: Studied the dataset and its features to get an overall picture of the data and to select the part of it required for analysis.
  2. Cleaning and Wrangling data: Addressed multiple shortcomings of the data, such as handling null values and correcting the datatypes of the features.
  3. Feature Engineering:
    • Created new fields from existing ones to get a better sense of the data; for example, created the field MONTH from the existing field DAY OF YEAR.
    • Created subsets of the data to build visualizations for specific data features.
    • Used label encoding on categorical fields so that they could be included in the correlation analysis.
  4. Understanding Correlation: Created a heatmap to understand the correlation between the data fields.
  5. Data Analysis: Built charts such as bar graphs, line graphs, and scatter plots using matplotlib and the basic plotting functions in pandas to analyze the factors contributing to wildfires and the changes in the data over time.
  6. Data Visualization: After gaining insights from our analysis, we created maps using plotly and plotted data such as the number of incidents per state, fire size, changes over the ten years, and causes of fire to understand the impact of these factors across different geographic locations.
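Stages 2 to 4 can be sketched in a few lines of pandas. This is a minimal illustration under stated assumptions, not the project's actual code: the column names (FIRE_YEAR, DISCOVERY_DOY, STATE, CAUSE, FIRE_SIZE) and the toy rows are hypothetical, and category codes are used as one possible way to do label encoding.

```python
import pandas as pd

# Toy records standing in for the real dataset; all column names and
# values are assumptions made for this sketch.
fires = pd.DataFrame({
    "FIRE_YEAR": [2000, 2004, 2004, 2008, 2010],
    "DISCOVERY_DOY": [15, 60, 200, 366, 365],  # day of year, 1-366
    "STATE": ["CA", "AK", "TX", "AK", "CA"],
    "CAUSE": ["Lightning", "Lightning", "Arson", "Campfire", "Arson"],
    "FIRE_SIZE": [12.0, 5400.0, 3.5, 880.0, 0.4],
})

# Cleaning: drop rows missing the fields we need, then fix the datatypes.
fires = fires.dropna(subset=["FIRE_YEAR", "DISCOVERY_DOY"])
fires = fires.astype({"FIRE_YEAR": "int64", "DISCOVERY_DOY": "int64"})

# Feature engineering: build a date from year + day of year, then take
# the month. The year is needed because leap years shift the mapping
# (day 60 is Feb 29 in 2004 but Mar 1 in a non-leap year).
year_doy = (fires["FIRE_YEAR"].astype(str)
            + fires["DISCOVERY_DOY"].astype(str).str.zfill(3))
fires["MONTH"] = pd.to_datetime(year_doy, format="%Y%j").dt.month

# Label encoding: map each category to an integer code.
# Codes follow the sorted category order and carry no numeric meaning.
fires["STATE_CODE"] = fires["STATE"].astype("category").cat.codes
fires["CAUSE_CODE"] = fires["CAUSE"].astype("category").cat.codes

# Correlation matrix over the numeric and encoded fields; this is the
# table a heatmap would display.
numeric_cols = ["FIRE_YEAR", "MONTH", "STATE_CODE", "CAUSE_CODE", "FIRE_SIZE"]
corr = fires[numeric_cols].corr()
```

One caveat with this design: label-encoded codes for unordered categories such as state or cause are arbitrary, so their correlation coefficients should be read with care.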

Challenges we ran into

We faced multiple challenges, such as:

  • It was difficult to understand some features of the data and their impact on the overall result.
  • A major challenge came when we created the heatmap to understand the correlation between the fields and found that the correlations were very weak.

Accomplishments that we're proud of

One major accomplishment we're proud of is that we were able to map the different factors affecting the fires and their implications, and to find some interesting patterns, such as Alaska facing the largest number of massive fires.
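A pattern like this can be checked by filtering on fire size and counting per state. The sketch below is illustrative only: the column names, the made-up rows, and the cutoff for a "massive" fire (`MASSIVE_THRESHOLD`) are all assumptions, and the output does not reproduce the project's result.

```python
import pandas as pd

# Made-up rows for illustration; not the project's data.
fires = pd.DataFrame({
    "STATE": ["AK", "AK", "CA", "TX", "AK", "CA"],
    "FIRE_SIZE": [12000.0, 30.0, 8000.0, 5.0, 45000.0, 150.0],
})

# Hypothetical cutoff for a "massive" fire, in the dataset's size units.
MASSIVE_THRESHOLD = 5000.0

# Keep only the massive fires, then count them per state.
massive = fires[fires["FIRE_SIZE"] >= MASSIVE_THRESHOLD]
massive_per_state = massive["STATE"].value_counts()

# State with the largest number of massive fires in this toy sample.
top_state = massive_per_state.idxmax()
```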

What we learned

This project taught us a lot. Some of the lessons we learned were:

  • Data is not always what it looks like; we need to dive deeper into each field to understand it better.
  • Sometimes the outcomes of the data are not as expected. For instance, we expected the fields to be highly correlated, but that wasn't the case, as our heatmap showed.
  • A lot of new insights can be drawn by simply communicating with your teammates.

What's next for Natural Science - Wildfires in the US (2000 - 2010)?

This project can be extended much further. Some ideas are:

  • Including financial features to understand the losses the fires have caused, or the funding required to address the issue, and how these change across years and states.
  • Adding data to understand the resources required to address the problem.
