What inspired me - This project has the noble goal of helping students perform better after graduation. It looks at how universities could study student history (10th and 12th grades) to identify the key attributes that influence student performance, in order to provide better guidance.
What it does - It identifies the top 40 predictors (out of 3345 in total) that most influence job satisfaction after graduation.
How I built it - I built it in R. I divided the dataset into 2 parts based on columns, kept only the records that had all the information, identified the response variable, and used PLS (Partial Least Squares) regression to get the list of important variables.
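As a rough illustration of how PLS regression can rank variables, the sketch below computes the weight vector of the first PLS component for a single response and sorts the predictors by absolute weight. The project itself was written in R; this is plain Python for illustration only. The function names, the toy data, and the choice to rank by first-component weights are all assumptions, not the project's actual code.

```python
import math

def standardize(column):
    """Center a column to mean 0 and scale it to unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    if sd == 0:
        raise ValueError("constant column; remove it before PLS")
    return [(v - mean) / sd for v in column]

def pls_first_component_ranking(columns, response):
    """Rank predictors by |weight| on the first PLS component (single response).

    columns: dict mapping predictor name -> list of numeric values
    response: list of numeric values, same length as each column
    """
    y = standardize(response)
    # On standardized data, the first PLS weight vector is proportional to X^T y.
    raw = {
        name: sum(x * t for x, t in zip(standardize(col), y))
        for name, col in columns.items()
    }
    norm = math.sqrt(sum(w * w for w in raw.values()))
    weights = {name: w / norm for name, w in raw.items()}
    return sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical toy data, not taken from the project dataset.
predictors = {
    "grade_10": [60, 70, 80, 90, 100],
    "grade_12": [65, 60, 85, 80, 95],
    "noise": [3, 1, 4, 1, 5],
}
satisfaction = [1, 2, 3, 4, 5]

ranking = pls_first_component_ranking(predictors, satisfaction)
top_predictors = [name for name, _ in ranking[:2]]
```

Only the first component is computed here. Further components would require deflating the predictors and repeating the step, which a PLS library handles internally.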
Challenges I ran into - The dataset had many columns with NAs, a mix of numeric and categorical variables, negative values for some variables, and columns with very high correlation.
Accomplishments that I'm proud of - I kept only the entries that had all the information and no NAs, handled the categorical variables, replaced negative values with zero, removed near-zero-variance variables, and removed highly correlated variables before applying linear regression techniques.
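The preprocessing steps above can be sketched as follows. This is a minimal plain-Python illustration (the project used R). The thresholds, helper names, and toy rows are assumptions, a simple variance threshold stands in for whatever near-zero-variance rule the project applied, and categorical handling is left out because the write-up does not say how it was done.

```python
import math

def drop_incomplete_rows(rows):
    """Keep only rows where no value is missing (None stands in for NA)."""
    return [row for row in rows if all(v is not None for v in row.values())]

def clip_negatives(rows):
    """Replace negative values with zero."""
    return [{k: max(v, 0) for k, v in row.items()} for row in rows]

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def correlation(a, b):
    """Pearson correlation of two equal-length, non-constant lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(
        sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)
    )

def select_columns(rows, min_variance=1e-8, max_abs_corr=0.95):
    """Drop near-constant columns, then any column highly correlated with one already kept."""
    names = list(rows[0].keys())
    cols = {name: [row[name] for row in rows] for name in names}
    kept = []
    for name in names:
        if variance(cols[name]) <= min_variance:
            continue  # near-zero variance: carries almost no information
        if any(abs(correlation(cols[name], cols[k])) > max_abs_corr for k in kept):
            continue  # nearly duplicates a column that is already kept
        kept.append(name)
    return kept

# Hypothetical toy rows, not taken from the project dataset.
rows = [
    {"grade_10": 1, "grade_10_doubled": 2, "constant_flag": 5, "score_delta": -3},
    {"grade_10": 2, "grade_10_doubled": 4, "constant_flag": 5, "score_delta": 1},
    {"grade_10": 3, "grade_10_doubled": None, "constant_flag": 5, "score_delta": 2},
    {"grade_10": 4, "grade_10_doubled": 8, "constant_flag": 5, "score_delta": 0},
    {"grade_10": 5, "grade_10_doubled": 10, "constant_flag": 5, "score_delta": 4},
]

clean = clip_negatives(drop_incomplete_rows(rows))
kept_columns = select_columns(clean)
```

In this toy example the row with a missing value is dropped, the negative `score_delta` becomes zero, and the constant and duplicated columns are filtered out, leaving `grade_10` and `score_delta`.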
What I learned - Every dataset is different and needs to be pre-processed accordingly.
What's next for Student Success Analytics - Accenture - Consider other KPIs as the response variable to identify variable importance.