Say I have a numpy array consisting of 10 elements, for example:
a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])
Now I want to efficiently set all values of a that are higher than 10 to 0, so I’ll get:
[2, 0, 0, 7, 9, 0, 0, 0, 5, 3]
I currently use a for loop, which is very slow:
# Zero values below "threshold value".
def flat_values(sig, tv):
    """
    :param sig: signal.
    :param tv: threshold value.
    :return:
    """
    for i in np.arange(np.size(sig)):
        if sig[i] < tv:
            sig[i] = 0
    return sig
How can I achieve that in the most efficient way, keeping in mind big arrays of, say, 10^6 elements?
Answer
Generally, list comprehensions are faster than for loops in Python (because Python knows that it doesn’t need to handle many of the things that might happen in a regular for loop). Note that this gives you a plain Python list rather than a numpy array:
a = [0 if a_ > thresh else a_ for a_ in a]
but, as @unutbu correctly pointed out, numpy supports element-wise comparison, which gives you a boolean array, and indexing with such a boolean array, so:
super_threshold_indices = a > thresh
a[super_threshold_indices] = 0
would be even faster.
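For reference, here is the same idea as a small self-contained sketch using the array from the question. The names thresh and mask are only illustrative, and the assignment modifies a in place:

```python
import numpy as np

a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])
thresh = 10  # illustrative threshold taken from the question

# Element-wise comparison yields a boolean array of the same shape as a.
mask = a > thresh

# Boolean indexing assigns 0 to every position where mask is True (in place).
a[mask] = 0

print(a.tolist())  # [2, 0, 0, 7, 9, 0, 0, 0, 5, 3]
```

The two steps can also be written as the one-liner a[a > thresh] = 0.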
Generally, when applying operations to vectors of data, have a look at numpy’s ufuncs (universal functions), which often perform much better than Python functions that you map using any native mechanism.
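As a rough sketch of that advice: np.greater is the ufunc behind the > operator on arrays, and np.where (a vectorized function, not itself a ufunc) can build a new array from the result without touching the input. The helper name zero_above is made up for this example:

```python
import numpy as np

def zero_above(values, thresh):
    """Return a new array with entries greater than thresh replaced by 0.

    Hypothetical helper for illustration; the input is left unmodified.
    """
    values = np.asarray(values)
    # np.greater compares element-wise and returns a boolean array.
    mask = np.greater(values, thresh)
    # np.where picks 0 where mask is True and the original value elsewhere.
    return np.where(mask, 0, values)

a = np.array([2, 23, 15, 7, 9, 11, 17, 19, 5, 3])
result = zero_above(a, 10)
print(result.tolist())  # [2, 0, 0, 7, 9, 0, 0, 0, 5, 3]
```

Whether this or the in-place boolean assignment is preferable depends on whether you still need the original values afterwards; for arrays of around 10^6 elements it is worth timing both on your own data.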