Hojun Chan, Brian Yeh, Carin Yao, Prudential HACKRU Challenge 2
test <- read.csv('C:/Users/hojun/Desktop/HackRU_testData2.csv', row.names=1)
train <- read.csv('C:/Users/hojun/Desktop/HackRU_trainData.csv', row.names=1)
Reads in the data.
The test data had to be cleaned up because it was missing the ID column, and an empty column had to be added for LowestRisk.
model1 <- glm(LowestRisk ~ ., family = binomial(logit), data = train)
summary(model1)
Summary of the logistic regression model.
Small p-values reject the null hypothesis and show which variables are significant.
More asterisks next to a p-value in the output mean greater significance.
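The idea above can be sketched on simulated data (not the HackRU dataset; the variable names and effect sizes here are made up for illustration): fit a logistic regression with one informative predictor and one noise predictor, then read the p-values off the coefficient table.

```r
# Minimal sketch: glm() with a binomial family and p-values from summary().
set.seed(42)
n  <- 500
x1 <- rnorm(n)                      # informative predictor
x2 <- rnorm(n)                      # pure noise predictor
y  <- rbinom(n, 1, plogis(2 * x1))  # outcome depends only on x1
toy <- data.frame(y, x1, x2)

fit   <- glm(y ~ x1 + x2, family = binomial(logit), data = toy)
pvals <- summary(fit)$coefficients[, "Pr(>|z|)"]
print(pvals)  # x1 should get a tiny p-value (and asterisks); x2 should not
```

The `Pr(>|z|)` column of `summary()` is where the significance stars come from.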
AIC_criteria <- step(glm(LowestRisk ~ ., family = binomial(logit), data = train), direction = "backward")
summary(AIC_criteria)
Removes the variables that are insignificant for predicting LowestRisk, using backward stepwise regression with the AIC (Akaike's 'An Information Criterion') function.
Regression equation -> variables that are significant for predicting whether a customer is Low Risk:
BIC can also be used to see which variables are important.
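A small sketch of that selection step on simulated data (again, not the HackRU dataset): `step()` uses the AIC penalty `k = 2` by default, and passing `k = log(n)` makes it use the BIC penalty instead, which tends to prune more aggressively.

```r
# Sketch: backward stepwise selection, AIC penalty vs. BIC penalty.
set.seed(1)
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 1 + 3 * x1 + rnorm(n)        # only x1 actually matters
toy <- data.frame(y, x1, x2, x3)

full      <- lm(y ~ ., data = toy)
aic_model <- step(full, direction = "backward", trace = 0)               # AIC (k = 2)
bic_model <- step(full, direction = "backward", trace = 0, k = log(n))   # BIC penalty
print(formula(aic_model))
print(formula(bic_model))
```

The truly informative predictor `x1` survives under both criteria; the noise predictors are candidates for removal.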
pred <- predict(model1, test, type = "response")
df <- data.frame(pred)
head(df)

pred2 <- predict(AIC_criteria, test, type = "response")
df2 <- data.frame(pred2)
head(df2)
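With `type = "response"`, `predict()` returns probabilities rather than class labels. To turn those into a Low Risk / Not Low Risk call, a cutoff is needed; the 0.5 threshold and the example probabilities below are assumptions for illustration, not part of the write-up.

```r
# Sketch: converting predicted probabilities into class labels.
pred   <- c(0.12, 0.55, 0.93, 0.40)  # stand-in for predict(..., type = "response")
labels <- ifelse(pred > 0.5, "LowestRisk", "NotLowestRisk")
print(labels)
```

A different cutoff can be chosen to trade off false positives against false negatives.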