Inspiration

We were inspired to build Hank after reading through the sponsor contests and seeing that Vitech had posed what looked like a very challenging problem. We built Hank as a proof of concept to show that much of the advice people seek from insurance professionals can be generated accurately by a finely tuned model. Given our interest in machine learning and data visualization, the project seemed like a perfect fit. With Hank, we aimed to generate plan suggestions and prices for a user from a simple survey. We believe a page like this could live on an insurance company's website and provide users with accurate quotes based on a machine learning model that the company can tweak in real time.

What it does

Hank provides a broad set of functionality that saves time and money for both clients and insurance providers.

For clients looking to purchase health insurance, Hank provides an easy introduction to the process: a simple, friendly application that guides the client through a series of questions about their age, family status, and personal health. With this information, Hank provides an accurate quote for each of the four insurance plans and recommends the plan that is likely the best fit for them.

Hank also provides a provider-facing visualization that lets the insurance provider tweak Hank's suggestions based on business metrics such as lifetime value (LTV), customer satisfaction, and customer retention, all without re-training the two machine learning models that Hank uses to provide suggestions.

How we built it

Hank's suggestion system is composed of three modules that work in series to provide the most accurate and useful data to the user.

The first module, the premium estimator, uses the user's demographic data to estimate the premium for each of the four plans (bronze, silver, gold, platinum). This is done with a neural network trained using TensorFlow on the insurance dataset provided by Vitech for this competition. Because the data is a mixture of continuous and discrete features, a neural network lets Hank capture complex associations between user features and make accurate premium predictions.
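As a rough illustration of what "a mixture of continuous and discrete data" means for the input of such a network, the sketch below encodes one survey response as a numeric vector. It is only a minimal sketch: the field names, the category list, and the scaling constants are assumptions made for the example and are not taken from the Vitech dataset, and the TensorFlow network itself is not shown.

```python
# Hypothetical survey fields; the real dataset columns may differ.
MARITAL_STATUSES = ["single", "married"]  # assumed category list


def encode_user(age, marital_status, num_dependents, smoker):
    """Turn one survey response into a flat numeric feature vector."""
    # Continuous values are scaled to roughly [0, 1] so no feature dominates.
    features = [age / 100.0, num_dependents / 10.0]
    # Discrete values are one-hot encoded: one slot per known category.
    features += [1.0 if marital_status == s else 0.0 for s in MARITAL_STATUSES]
    # A yes/no answer becomes a single 0/1 slot.
    features.append(1.0 if smoker else 0.0)
    return features


vector = encode_user(age=35, marital_status="married", num_dependents=2, smoker=False)
print(vector)
```

A vector like this would be the input to the premium network, with one output per plan.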

The second module is the suggestion module. It uses the premiums predicted by the premium estimator, together with the user's demographic data, to suggest which of the four plans would suit them best. Because of the many-dimensional nature of the dataset and the large number of data points available, we applied a k-nearest neighbors model using SciPy.
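The idea behind a k-nearest neighbors suggestion can be sketched in a few lines. The example below is a minimal pure-Python stand-in, not the project's SciPy code: the two features, the training points, and the function name `suggest_plan` are all made up for illustration. On a dataset of over a million records, a spatial index such as `scipy.spatial.cKDTree` would be a more realistic way to find neighbors.

```python
import math
from collections import Counter


def suggest_plan(query, training_points, k=3):
    """Return the most common plan among the k nearest training points."""
    # Sort labelled points by Euclidean distance to the query vector.
    by_distance = sorted(training_points, key=lambda item: math.dist(query, item[0]))
    nearest_labels = [plan for _, plan in by_distance[:k]]
    # Majority vote among the k closest neighbors.
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up data: (scaled age, scaled premium) -> plan the person chose.
training = [
    ((0.20, 0.10), "bronze"),
    ((0.25, 0.15), "bronze"),
    ((0.30, 0.12), "silver"),
    ((0.60, 0.40), "gold"),
    ((0.65, 0.45), "gold"),
    ((0.70, 0.50), "platinum"),
]

suggestion = suggest_plan((0.22, 0.12), training, k=3)
print(suggestion)
```

Features should be on comparable scales before distances are computed, otherwise the feature with the largest range dominates the vote.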

The third module, the business module, provides value to the insurance provider by giving fine control over the suggestions that Hank makes to customers using simple metrics. It works by combining the output of the first and second modules with data from the World Population API to produce a dataset that can be modified using simple scaling factors. Through the data-visualization front end of this module, the insurance provider can tweak Hank's suggestions to better align with business goals. This would normally be a very time-consuming process, since the machine learning models that Hank uses to make decisions would have to be re-trained and re-validated, but with the business module the key components are already abstracted away from the machine learning implementation.
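One way to picture "simple scaling factors" is as per-plan weights applied to the model output after prediction, so the models themselves stay untouched. The sketch below assumes the suggestion step yields a score per plan; the scores, the weights, and the function name `rerank` are hypothetical and only illustrate the idea, not Hank's actual business logic.

```python
PLANS = ["bronze", "silver", "gold", "platinum"]


def rerank(model_scores, business_weights):
    """Scale each plan's model score by a provider-set weight and rank the plans."""
    # Plans without an explicit weight keep their original score (weight 1.0).
    adjusted = {
        plan: model_scores[plan] * business_weights.get(plan, 1.0)
        for plan in PLANS
    }
    # Highest adjusted score first.
    return sorted(PLANS, key=lambda plan: adjusted[plan], reverse=True)


# Made-up model output for one user.
model_scores = {"bronze": 0.40, "silver": 0.35, "gold": 0.15, "platinum": 0.10}

default_ranking = rerank(model_scores, {})
boosted_ranking = rerank(model_scores, {"silver": 1.5})
print(default_ranking)
print(boosted_ranking)
```

Because the weights are applied after prediction, changing them takes effect immediately and requires no re-training.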

Challenges we ran into

Our first hurdle for Hank was downloading the dataset provided by Vitech. At 1.4 million records, it was difficult to retrieve and store, especially on a limited connection.

Once our data was successfully retrieved, our next challenge was deciding on an appropriate model for the machine learning side of the project. We tried and discarded several options, namely Bayesian classification, support vector machines, and random forest classification. By repeatedly training and testing different models, we ultimately settled on two: a k-nearest neighbors model and a neural network.

Accomplishments that we're proud of

We're happy to say that we managed to deliver on what we thought would be our two biggest challenges: a pleasing, responsive UI and a meaningful data visualization.

We're especially proud of the business logic control panel, which is used to modify and visualize the different goals a company wishes to optimize for.

What we learned

This project was a chance for us to learn about machine learning and, as software developers, to test our ability to put ourselves in the shoes of both our client and the users of the application, so that we could provide useful features to each. The time constraints of the hackathon also taught us to manage our time and communicate well. As a team of two people tackling such an ambitious project, constant communication was a must.

What's next for Hank: Your health insurance advisor

The next step for Hank is to train it against a larger, more diverse dataset and further optimize our models. Once we've trained our model against another set, we'll also be able to add additional steps to the survey, giving us more information to query with. The ultimate goal for Hank would be to host it live on an insurance company's site and see it used by real people and trained with real data.
