One of our teammates spoke to someone in real estate and learned that housing valuation is a very hands-on process that considers only standard factors such as bedrooms, bathrooms, square footage, and crime rates. We thought that automating the valuation process would shed new light on which factors actually drive real estate price changes. Although we did not have much background knowledge of real estate markets, we found some informative websites listing house prices for every city in the U.S., along with previous transaction data, house information, and much more. We thought it would be cool to predict changing house prices in a city by taking into account other hidden factors, such as proximity to movie theaters, coffee franchises, and parks.

What it does

It estimates how a house's price will change based on the house's geographic coordinates, taking into account its proximity to various "atypical" factors.
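As a rough illustration of what "proximity" can mean here, the sketch below computes the great-circle distance from a house to the nearest place in one category. It assumes coordinates are (latitude, longitude) pairs in degrees and uses the haversine formula; the function names are hypothetical and are not taken from our project code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_distance_km(house, places):
    """Distance from a house to the closest place in one category (hypothetical helper)."""
    return min(haversine_km(house[0], house[1], p[0], p[1]) for p in places)
```

A call such as nearest_distance_km(house_coords, theater_coords) would then give one raw proximity value per house and category, which a scoring function can turn into a model input.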

How we built it

We first selected a city, then scraped the atypical factors in that city (theaters, Whole Foods stores, etc.), scraped house listings, and preprocessed the data so it was ready for modeling. Next, we developed a scoring function for each category of atypical and traditional factors and computed the scores for each house. Using those scores as variables, we fit a multiple linear regression to determine the "weight" of each atypical factor. Finally, we improved the accuracy of the model by processing the data more carefully and by searching for better scoring functions through random sampling.
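The scoring, regression, and random-sampling steps can be sketched roughly as follows. This is a minimal, dependency-free sketch under several assumptions: the exponential-decay scoring function, the single scale_km parameter, and all function names are hypothetical stand-ins rather than the scoring functions we actually used, and the regression is ordinary least squares solved through the normal equations. In practice a library such as NumPy or scikit-learn would be the more usual choice.

```python
import math
import random


def proximity_score(distance_km, scale_km):
    """Hypothetical scoring function: 1.0 at distance 0, decaying with distance."""
    return math.exp(-distance_km / scale_km)


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(rows, targets):
    """Ordinary least squares; returns [intercept, w1, w2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, row):
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


def mean_squared_error(weights, rows, targets):
    return sum((predict(weights, r) - t) ** 2 for r, t in zip(rows, targets)) / len(targets)


def random_search_scale(distances, other_features, targets, trials=50, seed=0):
    """Sample random decay scales and keep the one with the lowest training error.

    Returns (error, scale_km, weights). A held-out set would give a fairer comparison.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        scale = rng.uniform(0.1, 10.0)
        rows = [[proximity_score(d, scale)] + list(f)
                for d, f in zip(distances, other_features)]
        weights = fit_linear_regression(rows, targets)
        err = mean_squared_error(weights, rows, targets)
        if best is None or err < best[0]:
            best = (err, scale, weights)
    return best
```

Each fitted weight after the intercept corresponds to one score column, so a large weight on a proximity score suggests that category matters for price, at least within the limits of a linear model.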

Challenges we ran into

As ambitious as this project is, we ran into many problems along the way. Since we had no previous experience with web scraping, we had to learn it as we went. Initial accuracy was disappointing, so we had to make adjustments and find better ways to improve it.

Accomplishments that we're proud of

We are proud of our improvement in accuracy (~55% -> 94%).

What we learned

  • How to web scrape with Selenium and BeautifulSoup4
  • How to run a machine learning model
  • How complex house price prediction is

What's next for Predicting Real Estate Value Using Atypical Factors

There is a lot of room for improvement for this project since there are countless factors that affect the price of a house. As our next step, we are considering researching and adding other atypical factors to better estimate house prices. Also, we might look into other machine learning algorithms for house price prediction.

posted an update

We just finished building our machine learning algorithm! We will keep you posted soon about how we were able to cleverly extract data and bring tremendous insight to the housing market!! We are now going to implement our algorithm in a larger city - Seattle, WA.

posted an update

We built a web scraper using Selenium and Beautiful Soup in Python to gather all of the data for this real estate project, and now we are coding the machine learning portion. We got data for Mount Pleasant, SC and Seattle, WA. We will keep you all updated on our accuracy results from machine learning!!
