Clustering images using unsupervised Machine Learning

I have a database of images that contains identity cards, bills and passports.
I want to sort these images into groups by document type (i.e., identity cards, bills and passports).
From what I have read, one way to do this is clustering, since the task would be unsupervised.
My idea is that the clustering would be based on the similarity between images, so that images with similar features end up in the same group.
I also know that this can be done with k-means.
So my problem is which features to extract and how to use images with k-means.
If anyone has done this before or has a clue about it, could you please recommend some links to start with, or suggest features that might be helpful?

Answer

Label a few examples, and use classification.

Clustering is just as likely to give you the clusters “images with a bluish tint”, “grayscale scans” and “warm color temperature”. That is a quite reasonable way to cluster such images.

Furthermore, k-means is very sensitive to outliers, and you probably have some in your data.
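
To make the outlier point concrete, here is a minimal sketch of plain k-means (Lloyd's algorithm) on one-dimensional toy data. The numbers, the starting centroids and the helper name `kmeans_1d` are all made up for illustration; real image features would have many dimensions, but the effect is the same in spirit.

```python
def kmeans_1d(points, centroids, iterations=20):
    """Plain Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            groups[nearest].append(p)
        # Update step: each centroid moves to the mean of its group
        # (an empty group keeps its previous centroid).
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return sorted(centroids)

# Made-up toy data: two tight groups, around 1 and around 5.
clean = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
# The same data plus a single extreme value.
with_outlier = clean + [100.0]

clean_centroids = kmeans_1d(clean, centroids=[0.0, 6.0])
outlier_centroids = kmeans_1d(with_outlier, centroids=[0.0, 6.0])

print(clean_centroids)    # centroids near 1 and 5: the two real groups
print(outlier_centroids)  # centroids near 3 and 100: the real groups got merged
```

With k = 2, the single outlier ends up with a centroid of its own, and the two genuine groups are merged into one cluster around 3.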

Since you want your clusters to correspond to specific human concepts, classification is what you need.
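
As a rough sketch of what "label a few examples, and use classification" could look like, here is a tiny nearest-centroid classifier. It is only one simple option, not a recommendation of a specific method. The two features (aspect ratio and fraction of dark pixels), the numeric values and the function names `train_centroids` and `classify` are assumptions invented for this example; in practice you would extract real features from your images and probably use a library classifier.

```python
def train_centroids(labeled_examples):
    """Average the feature vectors of each label into one centroid per class."""
    grouped = {}
    for features, label in labeled_examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

def classify(features, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=squared_distance)

# Made-up toy features per image: (width / height, fraction of dark pixels).
labeled_examples = [
    ((1.58, 0.20), "identity card"),
    ((1.60, 0.25), "identity card"),
    ((1.42, 0.30), "passport"),
    ((1.40, 0.35), "passport"),
    ((0.70, 0.60), "bill"),
    ((0.75, 0.55), "bill"),
]

centroids = train_centroids(labeled_examples)
print(classify((0.72, 0.58), centroids))  # bill
print(classify((1.59, 0.22), centroids))  # identity card
```

Because the class names come from your own labels, every prediction is one of the concepts you care about, which clustering cannot guarantee.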
