These are the steps we followed to complete the challenge:

1) Imported the dataset and created columns for the derived attributes LTV (loan-to-value), DTI (debt-to-income) and FEDTI (front-end DTI), plus a column for approval status.
2) If FEDTI > 28% or LTV > 95% or DTI > 43%, set the approval status to 'N'.
3) If FEDTI < 28% and LTV < 80% and DTI < 36%, set the approval status to 'Y'.
4) If LTV is between 80% and 95% or DTI is between 36% and 43%, used a k-means clustering algorithm to divide the filtered dataset into two clusters based on credit score, DTI and LTV. Labeled one cluster 'Y' and the other 'N'.
5) With the data labeled, determined the percentage of approved and not approved rows in the dataset and created the graphs.
6) Used .... classification algorithm on the now-labeled dataset. Test data - Train data - Accuracy -
7) Developed a simple web application that predicts whether a user's home loan request will be approved and, if not, what the major obstacles are.
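
The threshold rules in steps 2, 3 and 5 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `label_application`, the dictionary keys and the sample rows are all hypothetical, and the ratios are assumed to be stored as percentages. Rows that match neither rule, including values sitting exactly on a threshold, are returned as `None` here and left for the clustering step.

```python
from collections import Counter


def label_application(ltv, dti, fedti):
    """Return 'N', 'Y', or None for a borderline row.

    All inputs are percentages (e.g. 85.0 means 85%).
    None marks rows that would be passed on to the clustering step.
    """
    # Step 2: any ratio above its hard limit means rejection.
    if fedti > 28 or ltv > 95 or dti > 43:
        return "N"
    # Step 3: all ratios comfortably below the limits means approval.
    if fedti < 28 and ltv < 80 and dti < 36:
        return "Y"
    # Everything else is borderline and is not labeled by the rules.
    return None


# Hypothetical sample rows, made up purely for illustration.
applications = [
    {"ltv": 70.0, "dti": 30.0, "fedti": 20.0},  # all ratios low
    {"ltv": 97.0, "dti": 30.0, "fedti": 20.0},  # LTV above 95%
    {"ltv": 85.0, "dti": 40.0, "fedti": 25.0},  # borderline LTV and DTI
    {"ltv": 75.0, "dti": 45.0, "fedti": 25.0},  # DTI above 43%
]

labels = [label_application(**row) for row in applications]

# Step 5: share of approved rows among those the rules could label.
counts = Counter(labels)
labeled = counts["Y"] + counts["N"]
approved_share = 100.0 * counts["Y"] / labeled

print(labels)
print(f"Approved: {approved_share:.1f}% of rule-labeled rows")
```

In the full pipeline the `None` rows would receive their label from the two k-means clusters before the percentages are computed, so the share computed above covers only the rule-labeled rows.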
