Why does Python’s scikit-learn K-Means text clustering always give different results?

Question

I have a list of documents and a class that performs operations on that list. Basically, morphed_documents is a list of strings, and at the end the algorithm returns the cluster for each document. But why do the results and the model’s labels differ from run to run?

Accepted Answer

The K-Means algorithm starts from a random initialization of the cluster centroids. That selection differs each time you run KMeans, so the results may differ too. To get reproducible results, pass the random_state argument to KMeans, which fixes the initial selection of cluster centroids:

    model = KMeans(n_clusters=number_of_clusters,
                   init='k-means++',
                   max_iter=100,
                   n_init=100,
                   random_state=123)
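To check that fixing the seed is enough, here is a minimal sketch that clusters the same documents twice with the same random_state and compares the labels. The sample documents, the TF-IDF vectorization step, the cluster() helper, and the choice of two clusters are assumptions made for illustration. The question's own class and data are not shown, so this is not a reconstruction of it.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the question's morphed_documents list.
morphed_documents = [
    "cats purr and chase mice",
    "dogs bark and chase cats",
    "kittens and puppies play together",
    "stocks fell as markets closed lower",
    "investors sold shares after the earnings report",
    "the central bank raised interest rates",
]

# Assumed preprocessing: turn each string into a TF-IDF vector.
features = TfidfVectorizer().fit_transform(morphed_documents)


def cluster(random_state):
    """Fit K-Means on the TF-IDF features and return one label per document."""
    model = KMeans(
        n_clusters=2,
        init="k-means++",
        max_iter=100,
        n_init=10,
        random_state=random_state,
    )
    return model.fit_predict(features).tolist()


# Two independent fits with the same seed on the same data.
first_run = cluster(random_state=123)
second_run = cluster(random_state=123)

print(first_run)
print(second_run)
print("identical:", first_run == second_run)
```

With the seed fixed, both runs should return the same labels. Note that the cluster numbers themselves are arbitrary: a different seed can produce the same grouping with the numbers swapped, so when comparing runs that use different seeds it is safer to use a permutation-invariant score such as sklearn.metrics.adjusted_rand_score than to compare the label lists directly.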
