Tag: k-means

How to put importance coefficients to features before kmeans?

k-means machine-learning pca python scikit-learn

Let's say I have the given dataframe, and I would like to find clusters in these rows. To do so, I want to use KMeans. However, I would like to find clusters by giving more importance to [feature_1, feature_2] than to the other features in the dataframe. Let's say an importance coefficient of 0.5 for [feature_1, feature_2], and 0.5
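The excerpt above is cut off and no answer is preserved here. One common approach, offered only as a sketch, is to standardize the features and then multiply each column by the square root of its weight: k-means minimizes squared Euclidean distance, so scaling a column by sqrt(w) scales that feature's contribution to the distance by w. The data, the two extra column names, and the weight values below are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical data: four numeric features, 100 rows.
rng = np.random.default_rng(0)
columns = ["feature_1", "feature_2", "feature_3", "feature_4"]
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=columns)

# Illustrative weights (assumption): feature_1 and feature_2 count twice
# as much as the other features in the squared distance.
weights = np.array([2.0, 2.0, 1.0, 1.0])

# Standardize first so the weights are not confounded with feature scale.
X_scaled = StandardScaler().fit_transform(df)

# Multiplying by sqrt(w) weights each squared difference by w.
X_weighted = X_scaled * np.sqrt(weights)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_weighted)
```

Note that the fitted cluster centers live in the weighted space; dividing them by `np.sqrt(weights)` maps them back to the standardized feature space.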

How to print KMeans initial parameters?

k-means python python-3.x scikit-learn

I am using PyCharm to run KMeans on the Iris data. When I run this, it simply prints KMeans(). But I would like it to print the following: How can this be accomplished? Answer: Simply call kmeans.get_params(). This returns the parameters (default or custom) used when instantiating the estimator, in dictionary format. Please refer to this link for more information.
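A minimal sketch of the get_params() approach from the answer. The constructor arguments are illustrative; the set_config call is an addition not mentioned in the answer, and it assumes a scikit-learn version (0.21 or later) that supports the print_changed_only option.

```python
import sklearn
from sklearn.cluster import KMeans

# Illustrative arguments; any KMeans instance works the same way.
kmeans = KMeans(n_clusters=3, random_state=0)

# get_params() returns a dict of all constructor parameters,
# including those left at their defaults.
params = kmeans.get_params()
print(params)

# Alternative (assumption: scikit-learn >= 0.21): make the estimator's
# repr show every parameter instead of only the changed ones.
sklearn.set_config(print_changed_only=False)
print(kmeans)
```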

KMeans clustering from all possible combinations of 2 columns not producing correct output

cluster-analysis k-means matplotlib pandas python

I have a 4-column dataframe which I extracted from the iris dataset. I use k-means to plot 3 clusters from all possible combinations of 2 columns. However, there seems to be something wrong with the output, especially since the cluster centers are not placed at the center of the clusters. I have provided examples of the output. Only cluster_1

Python: Convert a pandas Series into an array and keep the index

arrays k-means numpy pandas python

I’m running a k-means algorithm (k=5) to cluster my data. To check the stability of my algorithm, I first run the algorithm once on my whole dataset, and afterwards I run the algorithm multiple times on 2/3 of my dataset (using different random states for the splits). I use the results to predict the cluster of the remaining 1/3

Clustering images using unsupervised Machine Learning

cluster-analysis computer-vision k-means python unsupervised-learning

I have a database of images that contains identity cards, bills and passports. I want to classify these images into different groups (i.e., identity cards, bills and passports). From what I have read, one way to do this task is clustering (since it is going to be unsupervised). The idea for me is like this: the clustering will

Kmean clustering top terms in cluster

cluster-analysis k-means python scikit-learn

I am using the Python k-means clustering algorithm to cluster documents. I have created a term-document matrix. Then I applied k-means clustering using the following code. My next task is to see the top terms in every cluster. Searching on Google suggested that many people have used km.cluster_centers_.argsort()[:, ::-1] for finding the top terms in the clusters using the

How to use Scikit kmeans when I have a dataframe

k-means python scikit-learn

I have converted my dataset to a dataframe. I was wondering how to use it with scikit-learn's KMeans, or whether any other k-means package is available. Answer: sklearn is fully compatible with pandas DataFrames. Therefore, it’s as simple as: That 0.6 means you use 60% of your data for training, 40% for testing. More info here: http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.train_test_split.html http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
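The answer's code is not preserved in this excerpt. The sketch below shows what such a snippet might look like, assuming the 0.6 refers to the train_size argument of train_test_split. The Iris data and the choice of 3 clusters are illustrative. In current scikit-learn releases train_test_split is imported from sklearn.model_selection; the sklearn.cross_validation module named in the first link was removed in version 0.20.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Hypothetical example data: the Iris measurements as a pandas DataFrame
# (as_frame=True assumes scikit-learn >= 0.23).
df = load_iris(as_frame=True).data

# train_size=0.6 keeps 60% of the rows for fitting, 40% for testing.
train, test = train_test_split(df, train_size=0.6, random_state=0)

# A DataFrame can be passed to fit() and predict() directly.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans.fit(train)
test_labels = kmeans.predict(test)  # cluster index for each held-out row
```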