I have tens of thousands of images, and I want to build a histogram for each pixel position across all of them. The following NumPy code works:
    import numpy as np
    import matplotlib.pyplot as plt

    nimages = 1000
    im_shape = (64, 64)
    nbins = 100
    # predefine the histogram bins
    hist_bins = np.linspace(0, 1, nbins)
    # create an array to store histograms for each pixel
    perpix_hist = np.zeros((64, 64, nbins))

    for ni in range(nimages):
        # create a simple image with normally distributed pixel values
        im = np.random.normal(loc=0.5, scale=0.05, size=im_shape)

        # sort each pixel into the predefined histogram
        bins_for_this_image = np.searchsorted(hist_bins, im.ravel())
        bins_for_this_image = bins_for_this_image.reshape(im_shape)

        # this next part adds one to each of those bins
        # but this is slow as it loops through each pixel
        # how to vectorize?
        for i in range(im_shape[0]):
            for j in range(im_shape[1]):
                perpix_hist[i, j, bins_for_this_image[i, j]] += 1

    # plot histogram for a single pixel
    plt.plot(hist_bins, perpix_hist[0, 0])
    plt.xlabel('pixel values')
    plt.ylabel('counts')
    plt.title('histogram for a single pixel')
    plt.show()
Can anyone help me vectorize the for loops? I can't work out how to index into the perpix_hist array properly. I have tens to hundreds of thousands of images, each about 1500×1500 pixels, and the loop version is too slow.
Answer
You can vectorize the two inner loops with np.meshgrid: build index arrays for the first and second dimensions and use them together with bins_for_this_image, which already supplies the index for the last dimension:
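    # Reuses nimages, im_shape, hist_bins and perpix_hist from the question's code.
    # With the default indexing='xy', x_grid[i, j] == i (row index) and
    # y_grid[i, j] == j (column index); both have shape im_shape.
    y_grid, x_grid = np.meshgrid(np.arange(im_shape[1]), np.arange(im_shape[0]))

    for i in range(nimages):
        # create a simple image with normally distributed pixel values
        im = np.random.normal(loc=0.5, scale=0.05, size=im_shape)

        # sort each pixel into the predefined histogram
        bins_for_this_image = np.searchsorted(hist_bins, im.ravel())
        bins_for_this_image = bins_for_this_image.reshape(im_shape)

        # one increment per pixel, with no Python-level loop over pixels
        perpix_hist[x_grid, y_grid, bins_for_this_image] += 1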
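If you want to convince yourself that the indexed update really matches the double loop, a small check along these lines may help. It is only a sketch under a few assumptions of mine, none of which come from the question: the helper names `update_loop` and `update_vectorized` are made up, the test shape is deliberately tiny and non-square, and the counts are stored as integers. It also swaps in broadcast `np.arange` index arrays instead of `np.meshgrid` (the idea is the same) and clips the bin indices, since `np.searchsorted` returns `nbins` for values above the last edge, which would otherwise index out of bounds.

```python
import numpy as np


def update_loop(perpix_hist, bin_idx):
    """Reference version: one increment per pixel via explicit loops."""
    rows, cols = bin_idx.shape
    for i in range(rows):
        for j in range(cols):
            perpix_hist[i, j, bin_idx[i, j]] += 1


def update_vectorized(perpix_hist, bin_idx):
    """Same update using integer-array indexing with broadcasting."""
    rows, cols = bin_idx.shape
    row_idx = np.arange(rows)[:, None]  # shape (rows, 1)
    col_idx = np.arange(cols)[None, :]  # shape (1, cols)
    # Every (row, col) pair occurs exactly once, so += never hits
    # the same element twice within one call.
    perpix_hist[row_idx, col_idx, bin_idx] += 1


rng = np.random.default_rng(0)
im_shape = (5, 7)  # hypothetical small test shape
nbins = 20
n_test_images = 50
hist_bins = np.linspace(0, 1, nbins)

hist_loop = np.zeros(im_shape + (nbins,), dtype=np.int64)
hist_vec = np.zeros(im_shape + (nbins,), dtype=np.int64)

for _ in range(n_test_images):
    im = rng.normal(loc=0.5, scale=0.05, size=im_shape)
    # searchsorted accepts the 2-D image directly and keeps its shape;
    # clip so values above the last edge land in the final bin.
    bin_idx = np.clip(np.searchsorted(hist_bins, im), 0, nbins - 1)
    update_loop(hist_loop, bin_idx)
    update_vectorized(hist_vec, bin_idx)

same = np.array_equal(hist_loop, hist_vec)
print("loop and vectorized results match:", same)
```

For the real data, keep an eye on memory: a 1500×1500×100 array of float64 counts takes roughly 1.8 GB, so an integer dtype such as np.uint32 (about half that) may be worth considering.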