Exploration
- Realized Pennsylvania is an oddity
- Looked at the correlation between sale_price and assessment_total_valuation
- There is a strong correlation in MA and RI but not PA
- Why would an assessor value a house so much lower in Pennsylvania?

Analysis
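The per-state correlation check from the exploration step could be sketched roughly as below. This is a minimal illustration, not the project's actual code: it assumes each record is a dict with `state`, `sale_amt`, and `assessment_total_valuation` keys, and the toy rows are made up purely to show the shape of the comparison.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_state(rows):
    """Group rows by state and correlate sale price with assessed value."""
    grouped = defaultdict(lambda: ([], []))
    for row in rows:
        sales, assessed = grouped[row["state"]]
        sales.append(row["sale_amt"])
        assessed.append(row["assessment_total_valuation"])
    return {state: pearson(s, a) for state, (s, a) in grouped.items()}


# Made-up rows for illustration only (not taken from the real dataset).
toy_rows = [
    {"state": "MA", "sale_amt": 300_000, "assessment_total_valuation": 290_000},
    {"state": "MA", "sale_amt": 450_000, "assessment_total_valuation": 440_000},
    {"state": "MA", "sale_amt": 600_000, "assessment_total_valuation": 610_000},
    {"state": "PA", "sale_amt": 300_000, "assessment_total_valuation": 90_000},
    {"state": "PA", "sale_amt": 450_000, "assessment_total_valuation": 40_000},
    {"state": "PA", "sale_amt": 600_000, "assessment_total_valuation": 70_000},
]

result = correlation_by_state(toy_rows)
for state, r in sorted(result.items()):
    print(f"{state}: r = {r:.2f}")
```

A state whose coefficient sits near 1 has assessments that track sale prices closely; a coefficient near 0 (or negative) flags a state like PA for a closer look.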
- Built a dataset of the records with a large gap between sale_amt and assessment_total_valuation
- First hypothesis: assessors incorrectly valued farmland
  - Test: acres vs (sales_amt - assessment_valuation)
  - Result: a negative, non-linear relationship, meaning farmland assessments are actually more accurate and suburban and city assessments are less accurate
- Second hypothesis: data points with a large ‘gap’ are concentrated around suburbs and cities
  - Test: plot latitude and longitude against sale_amt
  - Result: they are all around the Greater Philadelphia or Pittsburgh areas. Why?
- Last hypothesis: a bubble in the cities and suburbs of Pennsylvania
  - Test: the number of sales and the number of distressed_sells, and where they are
  - Result: the highest number of sales was concentrated in Philadelphia and Pittsburgh, and so was the highest number of distressed sales
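The gap dataset and the "where are the sales and distressed sales" check could be sketched along these lines. Everything beyond `sale_amt` and `assessment_total_valuation` is an assumption made for illustration: the `area` and `distressed` keys, the `GAP_THRESHOLD` cutoff, and the toy rows are all hypothetical, since the write-up does not say how a "large gap" was defined or how locations were labeled.

```python
from collections import Counter

# Hypothetical cutoff for what counts as a "large" gap, in dollars.
GAP_THRESHOLD = 100_000


def add_gap(rows):
    """Return copies of the rows with gap = sale_amt - assessment_total_valuation."""
    return [
        dict(row, gap=row["sale_amt"] - row["assessment_total_valuation"])
        for row in rows
    ]


def large_gap_rows(rows, threshold=GAP_THRESHOLD):
    """Keep only the rows whose sale price exceeds the assessment by more than threshold."""
    return [row for row in add_gap(rows) if row["gap"] > threshold]


def sales_by_area(rows):
    """Count all sales and distressed sales per area."""
    total = Counter()
    distressed = Counter()
    for row in rows:
        total[row["area"]] += 1
        if row["distressed"]:
            distressed[row["area"]] += 1
    return total, distressed


# Made-up rows for illustration only (not taken from the real dataset).
toy_rows = [
    {"area": "Philadelphia", "sale_amt": 400_000,
     "assessment_total_valuation": 60_000, "distressed": True},
    {"area": "Philadelphia", "sale_amt": 350_000,
     "assessment_total_valuation": 50_000, "distressed": False},
    {"area": "Pittsburgh", "sale_amt": 250_000,
     "assessment_total_valuation": 40_000, "distressed": True},
    {"area": "Lancaster County", "sale_amt": 200_000,
     "assessment_total_valuation": 180_000, "distressed": False},
]

large = large_gap_rows(toy_rows)
total, distressed = sales_by_area(toy_rows)
print(sorted({row["area"] for row in large}))
print(total.most_common(), distressed.most_common())
```

Comparing the areas that dominate `large` with the areas that dominate `distressed` is one simple way to see whether the large-gap records and the distressed sales cluster in the same places.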
Insight: people are buying and selling houses at a high rate and taking out mortgages with a high default rate, similar to the lead-up to ’08

So what?
- Model: predicts prices, including the over-inflated data points, with ____% accuracy
- It also predicts whether a house will become a distressed sale:
- Citizens Bank could use this to better price mortgage loans
- consumers - short mortgage loans
