Inspiration

Introduction

We want to empower businesses to adopt image recognition technology to solve valuable business problems. In this article, I will outline the benefits of building a satellite image recognition algorithm that can identify key building features. In subsequent posts, we will get into the technical detail of building these algorithms.

There’s been a lot of discussion about using advanced analytics techniques to identify image features. Some of the more fun ones can distinguish between different animals, and some of the more infamous ones can pick individuals out of a crowd. I wanted to create one that avoids the creepiness and ethical questions of facial recognition but is more useful to businesses than identifying what kind of dog is in a photo.

Real estate decisions are among the largest that companies and individuals make. Because these decisions are so large and infrequent, building feature data is usually collected one site at a time through onsite visits. But for companies that make hundreds or thousands of real estate decisions each year, in-person data collection becomes time-consuming and expensive.

Image recognition algorithms can replace the mundane work of looking up building features on Google Maps or searching municipal records, reducing labor costs while adding the consistency and efficiency of AI.

Research Purpose

The purpose of this research is to create an algorithm that can identify external building features from a satellite image. There are thousands of building features that are used in making commercial real estate business decisions. Most of these features are collected with expensive site surveys. Out of all the possible building features, we have identified four for our study: usage, layout, drive-thru, and area. These have business value and are available in public datasets. First, we built a CNN classifier to predict the building’s usage category from the pixels of a satellite image. Next, we built a CNN classifier that predicts the building layout category from a similar image. Then we built a CNN classifier that predicts whether the site has a drive-thru. Finally, we built a model that predicts the building shape area in square feet.

There are numerous real estate analytics applications of these models. These algorithms allow users to classify and measure building features automatically from Google Maps images, thereby replacing costly human data collection surveys. This data can be sold and used to improve search and analytics for commercial real estate decision-making. This new knowledge about buildings will allow us to update datasets where these features were missing. We will also be able to automatically filter new listings based on these features to focus on ideal sites. These models can also be leveraged in mergers and acquisitions, where we can analyze these features for an entire portfolio of buildings without having to look them up one by one.

Background

Commercial Real Estate

Building features are critical factors in making business real estate decisions. There are hundreds of different features that are collected with in-person site surveys or from outdated municipal databases. We selected four features in this research based on the data available to us and their business importance, but similar methodologies may apply to predicting others.

The commercial real estate industry is slowly expanding the use of data analytics to make decisions about offices, retail stores, restaurants, and industrial sites. A large part of making these decisions is identifying various building features. We can look up the location of almost any building, but we know very little about the building itself. Building features play an important role in understanding the value of the building. According to our industry partners, these features are not consistently recorded for the majority of buildings. Because each company makes relatively few real estate transactions, most decisions are treated as one-off situations that do not require comprehensive record-keeping. This leads to inconsistent and incomplete data records.

The algorithms described in this paper allow us to automatically classify some of these currently unavailable or unknown building features. This ability will provide significant value to businesses and individuals because real estate is still one of their most significant expenditures. Getting more robust data on building features will help businesses identify potential needs, risks, and opportunities.

There are hundreds of building features that are tracked and analyzed by real estate decision-makers. Different building features are more important for some industries than others. For example, quick-service restaurants may care a lot more about drive-thru and patio data than a retail store or a gym, which may care more about size, accessibility, and parking.

Currently, there are several different manual methods businesses use to collect this data. Most of these data can be obtained during site visits, but it is expensive to travel to thousands of sites across the country. Some of these data can be found in internal building plans, but this process is time-consuming and not always possible. Some of the building features can be found in municipal databases, but record-keeping standards vary greatly, and contacting multiple municipalities makes this an expensive exercise as well. Some of these features have been obtained by sending out surveys to building managers, but different people may have different standards for filling out the surveys, causing inconsistency in classifications. Some companies have turned to Google Maps and Street View site audits. This is cheaper but still requires analysts’ time, and not all features are available on the Internet.

All these collection techniques are time-consuming, costly, and often inconsistent, especially when we want to analyze thousands of locations across the country. Artificial intelligence building feature recognition can reduce the cost and improve the consistency of gathering this building feature data.

State of Image Recognition

We examined the performance of other image recognition algorithms to understand how our algorithms compare to them. With advances in computer processing, available data, and analytics techniques in the last few years, we have seen great advances in image recognition. For example, improvements have been made in areas ranging from facial recognition to recognizing handwritten digits to classifying an image into one of thousands of possible object categories. More specialized algorithms can identify images with accuracy in the upper 90% range. Kharkovyna (2019), writing in the Towards Data Science blog, claims that Facebook’s DeepFace is 97.5% accurate. Bhobé (2018) shows that Keras CNNs can classify digits with over 99% accuracy. Thompson (2016) shows that similar Keras CNN models classify images across 1,000 classes with 88% accuracy.

Even though impressive rates of accuracy have been achieved, accuracy diminishes somewhat as these algorithms become more generalized. Enge (2019), in table A, compares the performance of multi-class algorithms available on different cloud platforms. We see that general multi-class predictions are between 70% and 90% accurate. Even Facebook’s DeepFace breaks down a bit when classifying images of people who are not white males, according to Lohr (2019).

However, current algorithms on the cloud can only identify that our images are satellite images. They are not specific enough to recognize building features in the image. We built similar algorithms that can be used to classify and measure building features, and we use the 70% to 90% accuracy range of generic algorithms as the benchmark for our model accuracies.

Model Example

The first algorithm we built identifies whether a fast food (QSR) restaurant has a drive-thru. This is a simple question any human can answer just by looking at the aerial view on Google Maps, but we wanted to see if we could train an AI algorithm to do the same. To start, we collected thousands of addresses of QSR restaurants, which took searching through hundreds of publicly available data sources and running web crawlers. Then we downloaded satellite images of those restaurants.
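As a rough illustration of the image download step, the sketch below builds request URLs for satellite tiles centered on each address. It assumes the Google Maps Static API as the imagery source, which the article does not confirm; the function name, zoom level, image size, and sample address are all hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the Google Maps Static API is the imagery source. The article
# only says that satellite images were downloaded, not which service was used.
STATIC_MAP_ENDPOINT = "https://maps.googleapis.com/maps/api/staticmap"


def satellite_image_url(address, api_key, zoom=19, size_px=400):
    """Build a request URL for one satellite tile centered on an address."""
    params = {
        "center": address,               # street address or "lat,lng"
        "zoom": zoom,                    # hypothetical zoom for a single building
        "size": f"{size_px}x{size_px}",  # hypothetical image size in pixels
        "maptype": "satellite",
        "key": api_key,
    }
    return f"{STATIC_MAP_ENDPOINT}?{urlencode(params)}"


# Made-up address for illustration; the real list came from the collected QSR data.
addresses = ["123 Example St, Chicago, IL"]
urls = [satellite_image_url(a, api_key="YOUR_API_KEY") for a in addresses]
```

Each URL could then be fetched and saved under a file name tied to the address, so that every image can later be matched to its drive-thru tag.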

Fast food (QSR) restaurant with a drive-thru

With the complete data set of drive-thru yes/no tags and corresponding images, we built a Convolutional Neural Network (CNN) algorithm that can identify whether a restaurant has a drive-thru with 72% accuracy. CNNs have long been established, but it’s great to see that applying them to a different problem like satellite image recognition still yields comparable accuracy.
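To make the accuracy figure concrete, here is a minimal sketch of how such a score can be computed and checked against the 70% to 90% range quoted earlier for generic classifiers. The tags below are made up for illustration and are not the study’s data; the function names are hypothetical.

```python
def accuracy(true_tags, predicted_tags):
    """Share of images whose predicted tag equals the true tag."""
    if len(true_tags) != len(predicted_tags):
        raise ValueError("tag lists must have the same length")
    if not true_tags:
        raise ValueError("need at least one labeled image")
    matches = sum(t == p for t, p in zip(true_tags, predicted_tags))
    return matches / len(true_tags)


def within_benchmark(score, low=0.70, high=0.90):
    """Check a score against the 70%-90% range of generic classifiers."""
    return low <= score <= high


# Made-up tags for illustration only (1 = drive-thru, 0 = no drive-thru).
true_tags = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
predicted_tags = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]

score = accuracy(true_tags, predicted_tags)
print(f"accuracy = {score:.0%}, in benchmark range: {within_benchmark(score)}")
```

In practice the score would be computed on images held out from training, so that it reflects how the model handles restaurants it has not seen.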

Business Value of Satellite Image Recognition

Obviously, no individual consumer will fire up a CNN just to see if the burger joint on their drive home has a drive-thru. That said, a business professional might need to sort through thousands of real estate listings and find all of the locations that already have a drive-thru. Running this algorithm can be a lot more efficient than visiting each site or even looking them up on a map.
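The listing-filter use case can be sketched as follows. This is an illustration under stated assumptions, not the project’s code: the `Listing` record, its field names, the 0.5 threshold, and the lookup table standing in for a trained model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Listing:
    """Hypothetical listing record; field names are illustrative only."""
    address: str
    image_path: str


def listings_with_drive_thru(
    listings: Iterable[Listing],
    predict_probability: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Listing]:
    """Keep listings whose predicted drive-thru probability meets the threshold."""
    return [
        listing
        for listing in listings
        if predict_probability(listing.image_path) >= threshold
    ]


# Stand-in for a trained classifier: made-up scores keyed by image path.
fake_scores = {"a.png": 0.91, "b.png": 0.12, "c.png": 0.67}

listings = [
    Listing("1 First Ave", "a.png"),
    Listing("2 Second Ave", "b.png"),
    Listing("3 Third Ave", "c.png"),
]
shortlist = listings_with_drive_thru(listings, fake_scores.__getitem__)
```

Passing the predictor in as a function keeps the filter independent of any particular model, so the same shortlist logic could be reused for other building features.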

When your business needs to quantify how much a drive-thru is worth, or when it is acquiring a portfolio of real estate, this algorithm can save you countless hours of manual site audits. As a business decision-maker, you may already know this information for your own brand, but it can be beneficial to find this kind of information about your competitors.

Whether a QSR has a drive-thru is just one feature we are using to test whether AI data collection is a feasible and useful application of the latest developments in computer vision. Future posts will describe how we can identify whether a building is residential or industrial, how we can find freestanding sites, and even how we can measure the building area. Let us know if you have a building feature that you want to identify using AI vision.

This article describes the work that was done for a capstone project for the master’s degree in Analytics at the University of Chicago with Neeraj Tadur.
