One team member was looking for Airbnbs for her stay in Toronto after Hack the North and realized there was no way to determine how safe the surrounding area was. She had to scroll through numerous reviews for various listings and visit online forums to decide where to book. Safety plays a huge role in the Airbnbs that we book, so why can't we search for Airbnbs based on how safe they are? What if there was a way to search for Airbnbs in low-crime areas that had also been vouched for by previous guests?
What it does
The web app shows Airbnb listings for the area the user searches for and presents them in decreasing order of safety index. The safety index was determined from two signals: publicly available crime data for Toronto, analyzed over an extended time period, and the presence of keywords such as "safe", "quiet", or "scary" in reviews, which indicated safety or a lack of safety in the Airbnb's area.
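As a rough illustration of that ranking, here is a minimal sketch, not our actual code. It assumes each listing already has a crime-based score between 0 and 1 (higher means less crime nearby). The `Listing` class, the keyword sets beyond the three words quoted above, and the 0.7 weight are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical keyword sets; the write-up only names "safe", "quiet", "scary".
POSITIVE_KEYWORDS = {"safe", "quiet"}
NEGATIVE_KEYWORDS = {"scary"}


@dataclass
class Listing:
    name: str
    crime_score: float  # assumed range 0..1, higher = less crime nearby
    reviews: List[str]


def keyword_score(reviews: List[str]) -> float:
    """Share of safety keyword hits that are positive (0.5 if there are none)."""
    pos = neg = 0
    for review in reviews:
        # Lowercase each word and strip surrounding punctuation.
        words = {w.strip(".,!?\"'").lower() for w in review.split()}
        pos += len(words & POSITIVE_KEYWORDS)
        neg += len(words & NEGATIVE_KEYWORDS)
    total = pos + neg
    return 0.5 if total == 0 else pos / total


def safety_index(listing: Listing, crime_weight: float = 0.7) -> float:
    """Blend the crime and review signals; the weight is an assumption."""
    review_weight = 1.0 - crime_weight
    return crime_weight * listing.crime_score + review_weight * keyword_score(listing.reviews)


def rank_listings(listings: List[Listing]) -> List[Listing]:
    """Return listings in decreasing order of safety index."""
    return sorted(listings, key=safety_index, reverse=True)
```

Calling `rank_listings` on the search results gives the order in which the app would display them.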
How we built it
Data analysis was done in Python on two main data sets: crime data for Toronto and Airbnb listings/reviews in Toronto. The IBM Watson Natural Language Understanding API was used to analyze the sentiment of reviews for Airbnbs in Toronto.
Firebase was used on the backend: Realtime Database to store the data, Authentication to handle auth, Hosting to host the app, and Cloud Functions for serverless functions.
The frontend was written in TypeScript using Angular.
Challenges we ran into
It was difficult to find a good way to represent relative crime rates and to assign appropriate weights to them when calculating the safety rating.
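One possible way to frame the problem, shown here only as a sketch under assumptions: weight each crime category by severity, convert to incidents per 1,000 residents, then min-max normalize across neighbourhoods so the scores are relative. The category names, the weights in `CATEGORY_WEIGHTS`, and the input layout are hypothetical and not taken from the Toronto data set.

```python
from typing import Dict

# Hypothetical severity weights per crime category (illustrative only).
CATEGORY_WEIGHTS = {"assault": 3.0, "robbery": 2.0, "theft": 1.0}


def weighted_rate(counts: Dict[str, int], population: int) -> float:
    """Severity-weighted incidents per 1,000 residents for one neighbourhood."""
    weighted = sum(CATEGORY_WEIGHTS.get(cat, 1.0) * n for cat, n in counts.items())
    return 1000.0 * weighted / population


def crime_scores(neighbourhoods: Dict[str, dict]) -> Dict[str, float]:
    """Map each neighbourhood to 0..1, where 1.0 is the lowest weighted rate."""
    rates = {
        name: weighted_rate(data["counts"], data["population"])
        for name, data in neighbourhoods.items()
    }
    lo, hi = min(rates.values()), max(rates.values())
    if hi == lo:
        # All neighbourhoods are equal, so there is nothing to rank.
        return {name: 1.0 for name in rates}
    return {name: 1.0 - (rate - lo) / (hi - lo) for name, rate in rates.items()}
```

Min-max normalization is sensitive to outliers, so a single very high-crime neighbourhood compresses everyone else's scores. A percentile rank would be a reasonable alternative.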
We also had difficulty determining how to calculate a relevant satisfaction rating from the sentiment analysis of the reviews. Negative reviews were also hard to find and to weight appropriately.
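To make the weighting question concrete, here is a minimal sketch of one option: a weighted mean in which the rarer negative reviews count more heavily. It assumes each review has already been reduced to a sentiment score between -1 and 1. The function name and the `negative_weight` of 3.0 are hypothetical, not values we settled on.

```python
from typing import List, Optional


def satisfaction_rating(sentiments: List[float], negative_weight: float = 3.0) -> Optional[float]:
    """Weighted mean of per-review sentiment scores, rescaled to 0..1.

    Scores are assumed to lie in [-1, 1]. Negative reviews are scarce,
    so each one is counted `negative_weight` times (an assumed value).
    """
    if not sentiments:
        return None  # no reviews, so no rating
    weights = [negative_weight if s < 0 else 1.0 for s in sentiments]
    mean = sum(w * s for w, s in zip(weights, sentiments)) / sum(weights)
    return (mean + 1.0) / 2.0  # map [-1, 1] onto [0, 1]
```

With scores of 0.8, 0.6, and -0.5, the single negative review pulls the rating noticeably lower than a plain average would, which is the intended effect of the extra weight.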
Airbnb's API is private, so we were forced to rely on public data dumps.
Accomplishments that we're proud of
We successfully scraped publicly available crime data and augmented it with insights gleaned from sentiment analysis of the reviews.
What we learned
We learned how to use Firebase and the IBM Watson APIs. Sentiment analysis is hard, especially given the length of the reviews when trying to extract keywords.
What's next for safe-airbnb
Expand to other cities, and update the data in real time instead of in batches.