One of our teammates spoke with someone who works in real estate and learned that housing valuation is very hands-on and typically considers only standard factors such as bedrooms, bathrooms, square footage, and crime. We thought that automating the valuation process would shed new light on which factors actually drive real estate price changes. Although we did not have much background knowledge of real estate markets, we found some informative websites covering house prices in every city in the U.S., with previous transaction data, house information, and much more. We thought it would be cool to predict changing house prices in a city by taking into account other, hidden factors such as proximity to movie theaters, coffee franchises, and parks.
What it does
It estimates how a house's price will change based on the house's geographic coordinates, taking into account its proximity to various "atypical" factors.
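As a rough illustration of what "proximity" can mean here, the sketch below computes the great-circle distance from a house to the nearest place in one category using the haversine formula. The function names and coordinates are hypothetical and chosen for illustration; the write-up does not say which distance measure the project actually used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_distance_km(house, places):
    """Distance from a house to the closest place in one category (e.g. theaters)."""
    return min(haversine_km(house[0], house[1], lat, lon) for lat, lon in places)

# Hypothetical coordinates, for illustration only.
house = (47.6205, -122.3493)
theaters = [(47.6131, -122.3316), (47.6615, -122.3130)]
print(round(nearest_distance_km(house, theaters), 2), "km to the nearest theater")
```

A distance like this can then be turned into a per-category score for each house.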
How we built it
We first selected a city, then scraped the atypical factors in that city (theaters, Whole Foods stores, etc.), scraped house listings, and preprocessed the data. Then we developed a scoring function for each category of atypical and traditional factors and computed those scores for each house. Using the scores as variables, we fit a multiple linear regression to determine the "weight" of each atypical factor. Finally, we improved the accuracy of the model by processing the data more carefully and by finding better scoring functions through random sampling.
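The pipeline above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not our exact code: the exponential-decay scoring function, the `scale_km` parameter, the use of R² as the accuracy measure, and the synthetic data are all illustrative choices that the description above does not specify.

```python
import numpy as np

def proximity_score(distance_km, scale_km):
    """Assumed scoring function: maps distance to (0, 1], closer places score higher."""
    return np.exp(-np.asarray(distance_km, dtype=float) / scale_km)

def fit_weights(features, target):
    """Ordinary least squares with an intercept; returns [intercept, w1, w2, ...]."""
    X = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(X, target, rcond=None)
    return weights

def r_squared(features, target, weights):
    """Coefficient of determination of the fitted linear model."""
    X = np.column_stack([np.ones(len(features)), features])
    residual = target - X @ weights
    centered = target - target.mean()
    return 1.0 - (residual @ residual) / (centered @ centered)

def random_search_scale(distances, target, trials=200, seed=0):
    """Randomly sample the scoring parameter and keep the best-fitting one."""
    rng = np.random.default_rng(seed)
    best_r2, best_scale, best_weights = -np.inf, None, None
    for _ in range(trials):
        scale = rng.uniform(0.1, 10.0)
        feats = proximity_score(distances, scale)
        weights = fit_weights(feats, target)
        r2 = r_squared(feats, target, weights)
        if r2 > best_r2:
            best_r2, best_scale, best_weights = r2, scale, weights
    return best_r2, best_scale, best_weights

# Synthetic data for illustration only: 200 houses, distances to two categories.
rng = np.random.default_rng(1)
distances = rng.uniform(0.1, 8.0, size=(200, 2))
price_change = (3.0 + 5.0 * np.exp(-distances[:, 0] / 2.0)
                - 2.0 * np.exp(-distances[:, 1] / 2.0)
                + rng.normal(0.0, 0.05, size=200))

best_r2, best_scale, weights = random_search_scale(distances, price_change)
print("best scale:", best_scale, "weights:", weights, "R^2:", best_r2)
```

Note that this sketch scores each candidate on the same data it was fitted to; evaluating on held-out houses would give a more honest accuracy estimate.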
Challenges we ran into
Because this project is ambitious, we ran into many problems along the way. Since we had no previous experience with web scraping, we had to learn it as we went. Initial accuracy was disappointing, so we had to adjust our approach and find better ways to improve it.
Accomplishments that we're proud of
We are proud of our improvement in accuracy (~55% -> 94%).
What we learned
- How to web scrape with Selenium, BeautifulSoup4
- How to run a machine learning model
- The complexity of house price prediction
What's next for Predicting Real Estate Value Using Atypical Factors
There is a lot of room for improvement in this project, since countless factors affect the price of a house. As our next step, we are considering researching and adding other atypical factors to better estimate house prices. We might also look into other machine learning algorithms for house price prediction.