People filling out online surveys don't always put in sufficient effort. When a company receives hundreds or thousands of survey responses, it has to filter out the "insufficient effort responses" manually before it can perform any further statistical analysis. I am using machine learning to automate the process of filtering out the "insufficient effort responses".
What it does
IER Buster has a browser interface where surveyors select the survey they want to process. The surveyor can then use the browser interface to designate some survey questions as "consensus questions" (which have consensus responses) or "related questions" (which have related responses). IER Buster's server side then fetches the selected survey's responses from SurveyMonkey and uses a logistic regression model to perform binary classification on them. The responses that the algorithm identifies as "insufficient effort responses" are sent back to the browser for the surveyor to confirm. Once confirmed, IER Buster deletes them from SurveyMonkey. The surveyor's confirmation is also used as training data to improve the logistic regression coefficients using stochastic gradient descent. The more organizations use IER Buster, the more accurate its logistic regression model should become at classifying survey responses as "insufficient effort responses" (IER) or non-IER.
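To make the classify-then-learn loop concrete, here is a minimal sketch of logistic regression scoring and a single stochastic gradient descent step. It is written in Python for brevity and is not IER Buster's Spring Boot code; the function names, the learning rate, and the convention that label 1 means IER are all assumptions made for illustration.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights: List[float], bias: float, features: List[float]) -> float:
    """Estimated probability that a response is an IER (assumed label 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def sgd_update(
    weights: List[float],
    bias: float,
    features: List[float],
    label: int,
    learning_rate: float = 0.1,  # hypothetical value, not taken from IER Buster
) -> Tuple[List[float], float]:
    """One SGD step on the log loss for a single surveyor-confirmed response.

    For logistic regression, the gradient of the log loss with respect to
    each weight is (prediction - label) * feature.
    """
    error = predict_proba(weights, bias, features) - label
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


# Usage: the surveyor confirms that a flagged response really is an IER.
weights, bias = [0.0] * 5, 0.0
features = [1.0, 0.0, 1.0, 1.0, 0.0]  # made-up feature values
p_before = predict_proba(weights, bias, features)
weights, bias = sgd_update(weights, bias, features, label=1)
p_after = predict_proba(weights, bias, features)
```

After the update, the model scores the same response as more likely to be an IER, which is how each confirmation nudges the coefficients.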
IER Buster currently makes the IER or non-IER binary categorization based on five factors: response time, open text responses, consensus question responses, related question responses, and question response patterns.
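One way the five factors could be turned into numbers for the model is sketched below. The specific encodings (seconds per question, average word count, mismatch rates, and longest run of identical answers) are my own assumptions for illustration, as is the idea that related questions are expected to receive the same answer; IER Buster's actual feature definitions may differ. The sketch is in Python for brevity.

```python
from typing import Dict, List, Tuple


def longest_run_fraction(answers: List[int]) -> float:
    """Longest streak of identical consecutive answers, as a share of all answers."""
    if not answers:
        return 0.0
    longest = current = 1
    for prev, cur in zip(answers, answers[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest / len(answers)


def extract_features(
    seconds_taken: float,
    answers: List[int],                    # multiple-choice answers in question order
    open_texts: List[str],                 # free-text answers
    consensus: Dict[int, int],             # question index -> consensus answer
    related_pairs: List[Tuple[int, int]],  # question index pairs assumed to match
) -> List[float]:
    """Hypothetical encoding of the five factors as one feature vector."""
    n = len(answers)
    seconds_per_question = seconds_taken / n if n else 0.0

    word_counts = [len(text.split()) for text in open_texts]
    mean_open_text_words = sum(word_counts) / len(word_counts) if word_counts else 0.0

    misses = sum(1 for q, expected in consensus.items() if answers[q] != expected)
    consensus_miss_rate = misses / len(consensus) if consensus else 0.0

    mismatches = sum(1 for a, b in related_pairs if answers[a] != answers[b])
    related_mismatch_rate = mismatches / len(related_pairs) if related_pairs else 0.0

    return [
        seconds_per_question,           # response time
        mean_open_text_words,           # open text responses
        consensus_miss_rate,            # consensus question responses
        related_mismatch_rate,          # related question responses
        longest_run_fraction(answers),  # question response patterns
    ]


# Usage with made-up data: a fast response that mostly repeats one answer.
features = extract_features(
    seconds_taken=10.0,
    answers=[3, 3, 3, 3, 1],
    open_texts=["asdf"],
    consensus={4: 5},
    related_pairs=[(0, 1), (1, 4)],
)
```

In practice these values would likely need scaling to comparable ranges before being fed to gradient descent, since seconds and rates live on very different scales.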
How I built it
IER Buster's back-end is powered by Spring Boot. Its front-end uses the Angular framework.
Challenges I ran into
Coming up with model coefficients that are meaningful for survey responses was quite a challenge. Debugging both the browser-side and server-side code was also very time-consuming.
Accomplishments that I'm proud of
I am proud of being able to solve a practical real-world problem.
What I learned
I previously had a misconception about machine learning: I thought it was hard to understand and that only people with PhDs could do it. However, when I took a step back and read about logistic regression, I realized that I could do it too.
What's next for IER Buster