I am looking for a place to live in New York this summer for my internship. Since it is my first time in New York, I do not know what living in each part of the city is like in terms of safety, convenience of location (relative to my workplace and nearby amenities), rent, and so on. We therefore created an app with a simple, intuitive interface that analyzes any location in New York both qualitatively and quantitatively and estimates its fair AirBnB rent price, to guide users looking for a place to rent.
What It Does
homES ReInvented is a web app that evaluates any coordinate in New York, computing a 'living score' for the target location and a fair AirBnB rent price to help users find good rooms at a reasonable price.
Users choose a target location by double-clicking a point on the map. They can then display nearby amenities (parks, cafes, schools, hospitals, etc.) on the map to check whether the facilities they need are close to their future home. Users can also add "My Places", locations whose distance from the target location is factored into the living score.
After choosing the target location and My Places, users can generate a report on the target location. The report consists of the Proximity Report, the My Place Report, and the Summary. The Proximity Report shows how many amenities in each category (listed above) are within a 5-minute, 10-minute, and 15-minute walk, so each category has three data columns. The My Place Report shows the walking time, driving time, and distance to each My Place the user added. Finally, the Summary, at the top of the report, shows the living score of the target location on an A-to-D grade scale and the fair AirBnB price of renting a one-bedroom apartment for one day.
How We Built It
We built the client with React.js. Because ESRI provides REST APIs for retrieving geospatial data and running the computations we needed, we did not run our own server; we only maintain the client. The implemented features are as follows.
- Web Map Interface - The Web Map is integrated into the React app using esri-loader. We built a custom widget (the box in the bottom-right corner with checkboxes for displaying nearby amenities) and added extra layers to give users more information.
- Find Nearby Amenities - We used the Locator API, specifically the addressToLocations endpoint, to retrieve the list of locations matching the type of amenity we are looking for.
- My Place - We used the Suggest API to suggest addresses that match the user's input.
- Proximity Report - We iterated through a preset number of amenities and used the Route APIs to calculate the walking and driving time in minutes to each one.
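The three-column layout of the Proximity Report can be illustrated with a minimal sketch. It is not the project's code: the function name `proximity_report`, the input shape, and the sample walking times are hypothetical, and the walking times are assumed to have already been obtained from a routing service. It also assumes the counts are cumulative, so an amenity 4 minutes away counts in all three columns.

```python
# Walking-time thresholds (minutes) that form the three report columns.
THRESHOLDS = (5, 10, 15)


def proximity_report(walk_minutes_by_category):
    """Count amenities per category within each walking-time threshold.

    walk_minutes_by_category maps a category name to a list of walking
    times in minutes (hypothetical input; assumed to come from routing).
    """
    report = {}
    for category, times in walk_minutes_by_category.items():
        # Cumulative counts: a 4-minute walk also counts toward 10 and 15.
        report[category] = {
            limit: sum(1 for minutes in times if minutes <= limit)
            for limit in THRESHOLDS
        }
    return report


# Made-up walking times for two amenity categories.
sample = {"cafe": [3.0, 7.5, 12.0, 20.0], "park": [9.0, 14.5]}
report = proximity_report(sample)
for category, counts in report.items():
    print(category, counts)
```

Keeping the thresholds in one tuple means a different set of columns (for example a 20-minute walk) only needs a change in one place.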
Machine Learning Model
We spent a lot of time preprocessing data. To calculate statistics for each tract of New York, we used many external datasets provided by New York City (NYPD Complaint dataset, Basketball Court dataset, Pre-K School Directory dataset, NYC Health Hospitals Facilities, and many more) as well as AirBnB New York listing data. We added every dataset as a Feature Layer and then aggregated the layers to extract relevant statistics, such as the number of points in each tract, for use in training. Because the datasets were large, aggregation took almost two hours for some of them.
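To make the "number of points in each tract" statistic concrete, here is a small pure-Python sketch. It swaps the ESRI Feature Layer aggregation for a plain ray-casting point-in-polygon test, so it only illustrates the idea and is not how the project computed it. The tract shapes, IDs, and points are made up, and tracts are assumed not to overlap.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the last vertex connects
    back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_per_tract(tracts, points):
    """Return {tract_id: number of points falling inside that tract}."""
    counts = {tract_id: 0 for tract_id in tracts}
    for x, y in points:
        for tract_id, polygon in tracts.items():
            if point_in_polygon(x, y, polygon):
                counts[tract_id] += 1
                break  # assumption: tracts do not overlap
    return counts


# Two made-up square tracts and four made-up point records.
tracts = {
    "tract_a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "tract_b": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
points = [(1, 1), (0.5, 1.5), (3, 1), (5, 5)]
counts = count_points_per_tract(tracts, points)
print(counts)
```

This nested loop checks every point against every tract, which hints at why aggregation over large datasets is slow without a spatial index.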
We then trained models with all the features (e.g. distance from grocery stores, number of schools, and proximity to parks) extracted and processed from the aggregated data. The target was the AirBnB rent price of the corresponding New York tract.
The first model we trained was GradientBoost. From it we obtained feature importance values, which we used to build a mathematical heuristic for the living score. We then trained a linear regression model on a subset of the features identified as important. Using the coefficients of that model, we calculate the estimated fair AirBnB price from the features of the target location.
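The two calculations can be sketched as follows. The actual heuristic and coefficients are not given here, so everything below is an assumption for illustration: the living score is taken to be an importance-weighted average of features already scaled to [0, 1] (higher meaning better), the grade cut-offs are invented, and all feature names, weights, and coefficients are made up.

```python
def living_score(features, importances):
    """Importance-weighted average of features scaled to [0, 1].

    Assumption: this weighted average stands in for the real heuristic.
    """
    total_weight = sum(importances.values())
    weighted = sum(importances[name] * features[name] for name in importances)
    return weighted / total_weight


def to_grade(score):
    """Map a score in [0, 1] to a letter grade (cut-offs are invented)."""
    if score >= 0.75:
        return "A"
    if score >= 0.5:
        return "B"
    if score >= 0.25:
        return "C"
    return "D"


def estimate_price(features, coefficients, intercept):
    """Linear-model prediction: intercept plus the sum of coef * feature."""
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )


# Made-up numbers for a single target location.
features = {"safety": 0.8, "parks": 0.5, "schools": 0.5}
importances = {"safety": 0.5, "parks": 0.25, "schools": 0.25}
coefficients = {"safety": 40.0, "parks": 20.0}  # subset of the features

score = living_score(features, importances)
grade = to_grade(score)
price = estimate_price(features, coefficients, intercept=50.0)
print(grade, round(score, 2), round(price, 2))
```

Because the price model is linear, the client only needs the coefficients and intercept to produce an estimate, which fits an app that has no server of its own.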
What We Learned
It was our first time using the ESRI ArcGIS API, so working with geospatial data and running machine learning on it was challenging. Geospatial data also took longer to preprocess than a text dataset of the same size, so careful preprocessing before calling ESRI functionality was important.
This application has a lot of potential and could be developed in several directions.
- Improve UI/UX so that college students looking for housing can actually use it.
- City Analysis - The wealth of available data opens up many possible research questions.
- Predicting land price - This application could be turned into a finance-related app that estimates land prices to support commercial land-use decisions.