Inspiration

We wanted to develop a model that predicts the interest rate for an approved mortgage, and to investigate whether partitioning the dataset by race can improve predictive accuracy.

How we built it

We first built our own linear regression model. We then fit the same regression with the statsmodels package and found that its R squared was about the same as ours. Finally, we partitioned the population to improve predictive accuracy.
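As a rough illustration of what a hand-built linear regression and its R squared involve, here is a minimal sketch in plain Python. It is not our project code: it assumes a single predictor for simplicity, and the names `fit_simple_ols`, `r_squared`, `loan_to_value`, and `rate`, along with the numbers, are made up for the example.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data, invented for illustration only.
loan_to_value = [60, 70, 80, 90, 95]
rate = [3.0, 3.2, 3.5, 3.9, 4.0]

intercept, slope = fit_simple_ols(loan_to_value, rate)
print(f"R squared: {r_squared(loan_to_value, rate, intercept, slope):.3f}")
```

Fitting the same data with statsmodels, for instance `sm.OLS(y, sm.add_constant(X)).fit().rsquared`, should report the same R squared up to floating-point error, which is the kind of cross-check described above.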

Challenges we ran into

Our first model had an R squared below 0.4. By stratifying by race, eliminating outliers, and removing correlated variables such as "ages" and "above 62 years old", we raised the R squared to 0.526.
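The stratification step can be sketched as fitting a separate regression per group and comparing each group's R squared with the pooled one. This is an assumption-laden toy, not our pipeline: it uses one predictor, and the column names (`group`, `x`, `rate`) and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def r_squared_of_fit(x, y):
    """Fit y ~ a + b*x by least squares and return the R squared of that fit."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def r_squared_by_group(rows, group_key, x_key, y_key):
    """Split rows by group_key and return {group: R squared of its own fit}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return {
        name: r_squared_of_fit(
            [r[x_key] for r in members], [r[y_key] for r in members]
        )
        for name, members in groups.items()
    }


# Hypothetical rows, invented for illustration only.
rows = [
    {"group": "A", "x": 60, "rate": 3.0},
    {"group": "A", "x": 80, "rate": 3.4},
    {"group": "A", "x": 95, "rate": 3.8},
    {"group": "B", "x": 60, "rate": 3.6},
    {"group": "B", "x": 80, "rate": 4.3},
    {"group": "B", "x": 95, "rate": 4.9},
]

pooled = r_squared_of_fit([r["x"] for r in rows], [r["rate"] for r in rows])
print("pooled:", round(pooled, 3))
print("by group:", r_squared_by_group(rows, "group", "x", "rate"))
```

When groups follow different relationships between predictor and rate, a single pooled line fits none of them well, so per-group fits can show a higher R squared. Each group needs enough rows, and some variation in both the predictor and the rate, for its own fit to be meaningful.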

What's next for Interest Rate Prediction Model with An Emphasis on Race

Partitioning by race improved how well our model predicts interest rates, so there is more to explore about how best to partition the entire population in order to give better predictions. Further, we could explore why data from the minority races yield higher R^2 values, and better reveal the discrimination underlying such differences.
