Inspiration

Given the recent rise in forest fire incidents in the state of California, I wanted to explore how these events can be detected and what steps can be taken to avoid mass destruction.

What it does

My program is a proof of concept of geospatial analysis that can be used to detect and classify different features in an image obtained by a satellite. I was particularly interested in how Support Vector Machines, Random Forests, and Naive Bayes perform at this classification task, and in finding the most effective of the three models.

Methodology:

1) A subset of multi-spectral images is obtained from Sentinel-2 via the Copernicus Open Data Hub. The subset covers most of the north campus of the University at Buffalo.

2) Particular bands/channels were selected from the images for accuracy purposes.

3) The spatial data is processed before applying machine learning algorithms.

4) The models are trained using feature classes extracted from Google Earth imagery: i) road/pavement, ii) building, iii) trees, iv) water bodies.

5) QGIS is used to create polygons representing members of these classes. The polygons are first converted to a 2.5 m x 2.5 m raster grid and then to spatial points.

6) Values from Sentinel-2 bands B2, B3, B4, B5, B6, B7, B8, B8A, B11, and B12 are extracted and added to the point data set used for training, validating, and testing the models.

7) A prediction grid dataset is created by converting all raster bands to a spatial point data frame and then to a CSV file.

8) The grid point data file is used to predict land-use classes.
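The model comparison in steps 6–8 can be sketched with scikit-learn. The band values and class labels below are synthetic stand-ins for the real point data set extracted from the Sentinel-2 bands, so the accuracies printed here are illustrative only:

```python
# Sketch of comparing SVM, Random Forest, and Naive Bayes on per-point
# band values (synthetic data in place of the extracted Sentinel-2 points).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_points, n_bands = 500, 10            # one value per band B2..B12 per point
X = rng.random((n_points, n_bands))    # stand-in for extracted reflectances
y = rng.integers(0, 4, n_points)       # road, building, trees, water

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

models = {
    "SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Naive Bayes": GaussianNB(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.3f}")
```

On the real data set, the same loop would rank the three classifiers by held-out accuracy.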

Challenges I ran into

1) Sentinel-2 data is acquired in 13 spectral bands in the VNIR (visible-near IR) and SWIR (short-wave IR) ranges. There was a lot of noise in the images, so they were radiometrically and atmospherically corrected using Sen2Cor, a Python-based processor developed by ESA for formatting and processing Sentinel-2 products.

2) The bands were re-sampled to 10 m resolution in the Sentinel Toolboxes.

3) For on-screen digitization, the data had to be transferred into QGIS, which meant learning the software.
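The re-sampling step in point 2 was done in the Sentinel Toolboxes, but the idea can be sketched in NumPy as a nearest-neighbour upsample: a 20 m band is brought onto the 10 m grid by repeating each pixel. The array shapes below are toy values, not real band dimensions:

```python
# Minimal nearest-neighbour resampling sketch: upsample a 20 m band to
# the 10 m grid by repeating each pixel along both axes.
import numpy as np

def resample_nearest(band: np.ndarray, factor: int) -> np.ndarray:
    """Upsample a 2-D band by an integer factor via pixel repetition."""
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)

band_20m = np.arange(9, dtype=float).reshape(3, 3)   # toy 20 m band
band_10m = resample_nearest(band_20m, 2)             # now on a 10 m grid
print(band_10m.shape)  # (6, 6)
```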

Accomplishments that I'm proud of

I learnt a number of new terms and techniques used to pre-process data in this domain. The most interesting was that there exist algorithms that accurately project vector and raster data from different projection systems onto a 2-D representation. Developing a web application for future support and information delivery to users gave this project a nudge towards distributed systems, a key feature at top technology firms. Moreover, I am confident that by employing even more powerful techniques, such as deep neural networks built with Keras and TensorFlow, we can predict natural disasters at a very reliable rate.

What I learned

This project taught me how to scrape and process satellite image data. I learnt how to use third-party software such as Sen2Cor for formatting and processing the data, QGIS for on-screen digitization, and the Sentinel Toolboxes to re-sample the data more accurately. Using web services was also a learning outcome of this project. Using Google's BigQuery to drill into the JSON data obtained from FEMA made it easier to extract the relevant bits of information, cutting the size of the ingested data.
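The "drill into the JSON" step can be illustrated in plain Python. The field names below (`state`, `incidentType`, `declarationTitle`) follow FEMA's public DisasterDeclarationsSummaries feed, but the record itself is a made-up sample, and BigQuery is replaced here by simple in-memory filtering:

```python
# Sketch of filtering FEMA declaration JSON down to the relevant records.
# The sample payload is fabricated for illustration.
import json

raw = json.dumps({"DisasterDeclarationsSummaries": [
    {"state": "CA", "incidentType": "Fire", "declarationTitle": "WILDFIRES"},
    {"state": "NY", "incidentType": "Flood", "declarationTitle": "SEVERE STORMS"},
]})

records = json.loads(raw)["DisasterDeclarationsSummaries"]
fires = [r["declarationTitle"] for r in records
         if r["state"] == "CA" and r["incidentType"] == "Fire"]
print(fires)  # ['WILDFIRES']
```

In BigQuery the same filter would be a WHERE clause over the ingested table, which is what keeps the scanned data small.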

What's next for Satellite Image Classification

Building deep neural network and TensorFlow models, both to improve results and for benchmarking purposes. With such powerful tools, governments can nip the threat in the bud, or at least take evasive action in time. Sending alerts about hazardous conditions directly to the user is something that has not yet been implemented efficiently. In fact, the 2018 tsunami in Indonesia is a glaring example that systems on the ground are not very reliable, since they are subject to theft and vandalism.
