
One dataset causes "IndexError: list index out of range" while another runs perfectly

My Dataset

  • In numpy array
  • np.shape(data) -> (6989, 4)
  • stats.describe(data) -> DescribeResult(nobs=6989, minmax=(array([0., 0., 0., 0.]), array([ 299.99, 86785. , 10997. , 13222. ])), mean=array([ 12.47994992, 3407.00243239, 27.23293747, 109.72370869]), variance=array([1.42652452e+02, 4.71755188e+07, 6.17027586e+04, 2.92787820e+05]), skewness=array([ 4.27783176, 4.50762479, 31.57678605, 15.68962365]), kurtosis=array([ 58.23586935, 27.33838487, 1163.74537023, 302.6384056 ]))
  • stats.describe(clusterer.labels_) -> DescribeResult(nobs=6989, minmax=(array([0., 0., 0., 0.]), array([ 299.99, 86785. , 10997. , 13222. ])), mean=array([ 12.47994992, 3407.00243239, 27.23293747, 109.72370869]), variance=array([1.42652452e+02, 4.71755188e+07, 6.17027586e+04, 2.92787820e+05]), skewness=array([ 4.27783176, 4.50762479, 31.57678605, 15.68962365]), kurtosis=array([ 58.23586935, 27.33838487, 1163.74537023, 302.6384056 ]))
  • np.shape(clusterer.labels_) -> (6989,)

Original Dataset

CODE (all of it from the original guide that I am following)

color_palette = sns.color_palette('Paired', 12)
cluster_colors = [color_palette[x] if x >= 0
                  else (0.5, 0.5, 0.5)
                  for x in clusterer.labels_]
cluster_member_colors = [sns.desaturate(x, p) for x, p in zip(cluster_colors, clusterer.probabilities_)]
plt.scatter(*projection.T, 
            s=20, 
            linewidth=0, 
            c=cluster_member_colors, 
            alpha=0.25)

ERROR

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-175-64c069b8643a> in <module>
      2 cluster_colors = [color_palette[x] if x >= 0
      3                   else (0.5, 0.5, 0.5)
----> 4                   for x in clusterer.labels_]
      5 cluster_member_colors = [sns.desaturate(x, p) for x, p in zip(cluster_colors, clusterer.probabilities_)]
      6 plt.scatter(*projection.T, 

<ipython-input-175-64c069b8643a> in <listcomp>(.0)
      2 cluster_colors = [color_palette[x] if x >= 0
      3                   else (0.5, 0.5, 0.5)
----> 4                   for x in clusterer.labels_]
      5 cluster_member_colors = [sns.desaturate(x, p) for x, p in zip(cluster_colors, clusterer.probabilities_)]
      6 plt.scatter(*projection.T, 

IndexError: list index out of range

Tried Solutions

  • I have no NaN values in my dataset: print(np.isnan(np.sum(clusterer.labels_))) returned False.
  • I can see what the problem is programmatically: array elements are numbered starting from 0, so an index can run past the end of the list. But the same code was used with both my dataset and the original one, and it raises the error only with mine. – https://stackoverflow.com/a/1098660/10270590
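
Another check that may be worth running is to compare the labels against the length of the palette they index. The sketch below is only an illustration: out_of_range_labels and example_labels are hypothetical names, and the sample labels are made up rather than taken from either dataset.

```python
def out_of_range_labels(labels, palette_size):
    """Return the distinct labels that cannot index a palette of palette_size colors."""
    # Negative labels (noise) are skipped because the plotting code
    # gives them a fixed grey instead of a palette entry.
    return sorted({int(label) for label in labels if label >= palette_size})


# Hypothetical labels: -1 marks noise, the rest are cluster ids.
example_labels = [-1, 0, 3, 11, 12, 13, 12]
bad = out_of_range_labels(example_labels, palette_size=12)
print(bad)  # [12, 13]
```

With the real data the call would presumably be out_of_range_labels(clusterer.labels_, len(color_palette)). A non-empty result means some cluster ids are too large for the palette.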


Answer

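The issue was solved by adding more colors. The palette held only 12 colors, so any cluster label of 12 or higher indexed past the end of the list. Ex.: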

color_palette = sns.color_palette('Paired', 1000)
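
Instead of a fixed large number, the palette size could be derived from the labels, or the lookup could wrap around so a short palette is reused. Below is a minimal sketch of both ideas in plain Python. The names required_palette_size, colors_for_labels, stand_in_palette and sample_labels are hypothetical, and the stand-in palette is a made-up list of RGB tuples, not seaborn output.

```python
NOISE_COLOR = (0.5, 0.5, 0.5)  # same grey the question uses for label -1


def required_palette_size(labels):
    """Smallest palette length that every non-negative label can index."""
    return max((int(label) for label in labels), default=-1) + 1


def colors_for_labels(labels, palette, noise_color=NOISE_COLOR):
    """Map labels to colors, cycling through the palette if it is too short."""
    return [
        palette[int(label) % len(palette)] if label >= 0 else noise_color
        for label in labels
    ]


# Stand-in for a 12-color palette (assumption: any list of RGB tuples works).
stand_in_palette = [(i / 12, 0.0, 0.0) for i in range(12)]
sample_labels = [-1, 0, 11, 12, 25]

needed = required_palette_size(sample_labels)
colors = colors_for_labels(sample_labels, stand_in_palette)
print(needed)       # 26
print(len(colors))  # 5
```

With seaborn the first helper would presumably be used as sns.color_palette('Paired', required_palette_size(clusterer.labels_)). As far as I know, seaborn repeats colors when asked for more than a named palette contains, so 'Paired' still yields only 12 distinct colors and clusters 12 apart share a color either way.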

User contributions licensed under: CC BY-SA