Machine learning can be leveraged on historic data to produce powerful predicative models. With a knowledge of machine learning algorithms and a can do hackathon attitude we tested a number of models and machine learning methods on an open sourced data set.

What it does

We have a web site. (pun intended with the glu for glucose) this site is really just a place where we are presenting our findings. What this project does is build a number of predictive models which could be used to identify and predict if individuals had diabetes. Possibly helping to prevent and manage on of the most prevalent chronic disease in the United States.

How I built it

We used a combinations of tool including python and weka for the machine learning aspects and data cleaning and prep. For the website we used an amazon aws server configured into a LAMP stack with automatic deployment from github.

Challenges I ran into

We had to learn new machine learning algorithms on the fly. More specifically we investigated and learned about bagging and boosting algorithms. How ever I am actually really happy we ran into these problem because we got to explore some cool new aspects of machine learning.

Accomplishments that I'm proud of

Finding an open sourced data set. We spent almost a day and a half find a data set.

What I learned

Meta algorithms such as bootstrapping and boosting.

What's next for

More data and Hadoop!

Share this project: