
KMeans clustering from all possible combinations of 2 columns not producing correct output

I have a 4 column dataframe which I extracted from the iris dataset. I use kmeans to plot 3 clusters from all possible combinations of 2 columns.

However, something seems wrong with the output: the cluster centers are not placed at the center of the clusters. I have provided examples of the output. Only cluster_1 seems OK, but the other 3 look completely wrong.

How best can I fix my clustering? This is the sample code I am using


Dataset used:



Answer

You compute the clusters in four dimensions, which means the centroids are four-dimensional points too. You then plot two-dimensional projections of the clusters. So when you plot the centroids, you have to pick out the same two dimensions (columns) that you just used for the scatterplot of the individual points.
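Since the original code is not reproduced here, the following is only a minimal sketch of the idea, assuming scikit-learn's `KMeans` and the copy of iris from `load_iris`. The names `plot_pairs`, `X`, `labels`, and `centers` are illustrative and not taken from the question. The key line is the second `scatter` call, which indexes the centroids with the same column pair `(i, j)` as the data points.

```python
from itertools import combinations

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: the four columns are the standard iris measurements.
iris = load_iris()
X = iris.data                      # shape (150, 4)
names = iris.feature_names

# Cluster once, in all four dimensions.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_
centers = kmeans.cluster_centers_  # shape (3, 4): one 4-D centroid per cluster


def plot_pairs(X, labels, centers, names):
    """Scatter every pair of columns and overlay the matching centroid projection."""
    import matplotlib.pyplot as plt  # imported lazily so the clustering runs without it

    pairs = list(combinations(range(X.shape[1]), 2))  # 6 pairs for 4 columns
    fig, axes = plt.subplots(2, 3, figsize=(12, 7))
    for ax, (i, j) in zip(axes.ravel(), pairs):
        # Data points projected onto columns i and j.
        ax.scatter(X[:, i], X[:, j], c=labels, s=15)
        # Centroids projected onto the SAME columns i and j,
        # not always columns 0 and 1.
        ax.scatter(centers[:, i], centers[:, j], c="red", marker="x", s=100)
        ax.set_xlabel(names[i])
        ax.set_ylabel(names[j])
    fig.tight_layout()
    return fig
```

Calling `plot_pairs(X, labels, centers, names)` followed by `matplotlib.pyplot.show()` should draw six panels with each centroid sitting inside its cluster. If the centroids are instead always plotted as `centers[:, 0]` and `centers[:, 1]`, only the panel for the first two columns will look right, which would be consistent with the symptom described in the question.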

User contributions licensed under: CC BY-SA