Candy Clustering with K-means

In this project, I will use the K-Means algorithm to do the clustering of candies. Imagine that I am the CEO of a candy company. I want to create a new product. However, I am fairly conservative, so I do not want to make a radically new product. Instead, I want to make a product that is similar to other best-selling candies. The way to do this is simple. I look at all of the candies in the market and cluster them by properties like sweetness, sourness, and nuttiness. I pick one of the centroids of this cluster (the cluster with the absolute best sellers), and that is the candy that I should make. The market research has been provided for this project. The features are:

The name of the candy. It will almost always have spaces. The candy sweetness. This is a float between 0 and 1 (where 0 = not sweet, 1 = extremely sweet). The candy sourness. This is a float between 0 and 1 (where 0 = not sour, 1 = extremely sour). The candy nuttiness. This is a float between 0 and 1 (where 0 = no nuts, 1 = all nuts). The candy texture. This is a float between 0 and 1 (where 0 = smooth, 1 = hard and crunchy). The rating. This is a float between 0 and 5 to measure its popularity (e.g the "stars")

I would cluster candies by their sweetness, sourness, and nuttiness attributes. We want to keep 3 features so that we can visualize our results in 3D. I will find the average rating of each cluster and pick the centroid of that cluster as the desired candy I would manufacture.

Built With

python

Updates

Enoch Kumala started this project — May 01, 2020 10:55 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.