My Dataset
- In numpy array
np.shape(data)
-> (6989, 4)
stats.describe(data)
-> DescribeResult(nobs=6989, minmax=(array([0., 0., 0., 0.]), array([ 299.99, 86785. , 10997. , 13222. ])), mean=array([ 12.47994992, 3407.00243239, 27.23293747, 109.72370869]), variance=array([1.42652452e+02, 4.71755188e+07, 6.17027586e+04, 2.92787820e+05]), skewness=array([ 4.27783176, 4.50762479, 31.57678605, 15.68962365]), kurtosis=array([ 58.23586935, 27.33838487, 1163.74537023, 302.6384056 ]))
stats.describe(clusterer.labels_)
-> DescribeResult(nobs=6989, minmax=(array([0., 0., 0., 0.]), array([ 299.99, 86785. , 10997. , 13222. ])), mean=array([ 12.47994992, 3407.00243239, 27.23293747, 109.72370869]), variance=array([1.42652452e+02, 4.71755188e+07, 6.17027586e+04, 2.92787820e+05]), skewness=array([ 4.27783176, 4.50762479, 31.57678605, 15.68962365]), kurtosis=array([ 58.23586935, 27.33838487, 1163.74537023, 302.6384056 ]))
np.shape(clusterer.labels_)
-> (6989,)
Original Dataset
- https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html
np.shape(data_original)
-> (1797, 64)
np.shape(clusterer.labels_)
-> 1797
stats.describe(clusterer.labels_)
-> DescribeResult(nobs=1797, minmax=(-1, 9), mean=1.555370061213133, variance=9.243730890261299, skewness=0.8760784771049832, kurtosis=-0.4263956978117518)
CODE (all of it taken from the original guide that I am following)
color_palette = sns.color_palette('Paired', 12)
cluster_colors = [color_palette[x] if x >= 0
                  else (0.5, 0.5, 0.5)
                  for x in clusterer.labels_]
cluster_member_colors = [sns.desaturate(x, p) for x, p in zip(cluster_colors, clusterer.probabilities_)]
plt.scatter(*projection.T, s=20, linewidth=0, c=cluster_member_colors, alpha=0.25)
ERROR
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-175-64c069b8643a> in <module>
      2 cluster_colors = [color_palette[x] if x >= 0
      3                   else (0.5, 0.5, 0.5)
----> 4                   for x in clusterer.labels_]
      5 cluster_member_colors = [sns.desaturate(x, p) for x, p in zip(cluster_colors, clusterer.probabilities_)]
      6 plt.scatter(*projection.T,

<ipython-input-175-64c069b8643a> in <listcomp>(.0)
      2 cluster_colors = [color_palette[x] if x >= 0
      3                   else (0.5, 0.5, 0.5)
----> 4                   for x in clusterer.labels_]
      5 cluster_member_colors = [sns.desaturate(x, p) for x, p in zip(cluster_colors, clusterer.probabilities_)]
      6 plt.scatter(*projection.T,

IndexError: list index out of range
Tried Solutions
- I have no NaN values in my dataset. I tried
print(np.isnan( np.sum(clusterer.labels_)))
and it was False
- I can see from the answer linked here what the problem is programmatically: list elements are numbered starting from 0, so an index can fall past the end of the list. What confuses me is that the same code was used with both my dataset and the original one. It gives no error with the original dataset, but it fails with mine. – https://stackoverflow.com/a/1098660/10270590
Answer
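The issue was solved by adding more colors. The palette held only 12 colors, so any cluster label of 12 or higher indexed past the end of the list; the original dataset's labels stop at 9, which is why it never failed there. Ex.: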
color_palette = sns.color_palette('Paired', 1000)
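Rather than picking a large fixed number, the palette size could be derived from the labels themselves. The following is only a minimal sketch of that idea, not the guide's code: `make_palette` and `colors_for_labels` are hypothetical helpers, the `labels` list is made-up stand-in data for `clusterer.labels_`, and the standard-library `colorsys` module stands in for seaborn so the snippet runs on its own. With seaborn, the same idea would presumably amount to passing `clusterer.labels_.max() + 1` as the color count to `sns.color_palette`.

```python
import colorsys


def make_palette(n_colors):
    """Return n_colors RGB tuples spaced evenly around the hue wheel.

    Hypothetical stand-in for sns.color_palette, used here only so the
    example has no third-party dependencies.
    """
    return [colorsys.hsv_to_rgb(i / n_colors, 0.65, 0.9) for i in range(n_colors)]


def colors_for_labels(labels, noise_color=(0.5, 0.5, 0.5)):
    """Map each cluster label to a color; negative labels (noise) get gray."""
    # Labels run from 0 to max(labels), so max(labels) + 1 colors are needed.
    n_clusters = max(labels) + 1
    # Guard against the all-noise case, where max(labels) is -1.
    palette = make_palette(max(n_clusters, 1))
    return [palette[x] if x >= 0 else noise_color for x in labels]


# Made-up labels with a cluster index above 11, the case that broke
# the fixed 12-color palette.
labels = [-1, 0, 3, 12, 40]
cluster_colors = colors_for_labels(labels)
print(len(cluster_colors), cluster_colors[0])
```

Sizing the palette this way means the lookup cannot run past the end of the list, however many clusters are found, although with many clusters neighbouring colors become hard to tell apart.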