Say I have a pandas Series:
index | value
-------------
0     | 2
1     | 0
2     | 8
3     | 0
4     | 1
5     | 2
6     | 7
7     | 4
8     | 2
9     | 9
10    | 0
11    | 0
I need to get a Series (or array) of subrange maximum values, where each subrange starts at the current element and looks forward. For example, with a subrange of length 5, the first value should be max{2, 0, 8, 0, 1} = 8 and the second value should be max{0, 8, 0, 1, 2} = 8.
Starting from index 8, there are fewer than 5 elements left in the subrange. In that case the value should just be the maximum of the remaining elements.
The result should look like this:
index | value
-------------
0     | 8
1     | 8
2     | 8
3     | 7
4     | 7
5     | 9
6     | 9
7     | 9
8     | 9
9     | 9
10    | 0
11    | 0
I know we can simply do this by iterating over the Series. But as far as I know, that's not very efficient if we use iloc or iterate with iterrows(). Is there a more efficient and elegant way to do this? I have heard that vectorized operations should be very quick, but I haven't figured out how to use them here.
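For reference, the plain iteration described above could be sketched roughly as follows. The names `s`, `window`, and `loop_result` are assumptions chosen for illustration; they do not come from the original post.

```python
import pandas as pd

# Sample data from the question; `s` and `window` are illustrative names.
s = pd.Series([2, 0, 8, 0, 1, 2, 7, 4, 2, 9, 0, 0])
window = 5  # subrange length

# For each position, take the maximum of the current element and the
# next (window - 1) elements. Slicing past the end is safe: the slice
# is simply shorter near the tail of the Series.
loop_result = pd.Series(
    [s.iloc[i:i + window].max() for i in range(len(s))],
    index=s.index,
)
print(loop_result.tolist())  # [8, 8, 8, 7, 7, 9, 9, 9, 9, 9, 0, 0]
```

This works, but it runs a Python-level loop with one slice per element, which is the overhead the question is trying to avoid.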
Answer
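You can use rolling on the reversed Series. Reversing turns the usual trailing window into a forward-looking one, min_periods=1 handles the shorter subranges at the end, and assigning back to the column realigns the result on the original index.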
df['value'] = df['value'].iloc[::-1].rolling(5, min_periods=1).max()
df['value']
Out[158]:
0     8.0
1     8.0
2     8.0
3     7.0
4     7.0
5     9.0
6     9.0
7     9.0
8     9.0
9     9.0
10    0.0
11    0.0
Name: value, dtype: float64
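As a self-contained check of the idea above, here is a minimal sketch that applies the reversed rolling window to a standalone Series and compares it with a forward-looking window built from `FixedForwardWindowIndexer`. The indexer is not part of the original answer; it is an alternative, and the sketch assumes a pandas version that ships `pandas.api.indexers.FixedForwardWindowIndexer` (1.1 or later, as far as I know). The variable names are illustrative.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Sample data from the question.
s = pd.Series([2, 0, 8, 0, 1, 2, 7, 4, 2, 9, 0, 0], name="value")

# Approach from the answer: reverse, take a trailing window of 5,
# then reverse again to restore the original order.
reversed_max = s.iloc[::-1].rolling(5, min_periods=1).max().iloc[::-1]

# Alternative (assumed available): a window covering the current row
# and the next four rows, so no reversal is needed.
indexer = FixedForwardWindowIndexer(window_size=5)
forward_max = s.rolling(window=indexer, min_periods=1).max()

# rolling() returns floats; with min_periods=1 there are no NaN values
# here, so casting back to integers is safe.
print(forward_max.astype(int).tolist())
```

Both variants stay inside pandas' windowing code rather than looping in Python. The reversed version works on any pandas release that has `rolling`, while the indexer version states the forward-looking intent more directly.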