How can I efficiently find distances from each value to the next lower/higher value?

Question

I'll tell you what structures I am using, please feel free to recommend any changes like numpy arrays or something. Anyways what I have is a list of 5 million sequential entries that correspond to a stock price. I then have 2 more lists, each of these is the same length - 5 million entries. These lists correspond to an

Accepted Answer

You can solve this efficiently (in O(N log N) time) using a data structure similar to what is called a “monotonic queue”. You can google that, but the usual use case is pretty different from yours, so I’ll just explain. (strangely, this is the 3rd question I’ve seen here in a week where the answer needs a structure like this.)In your case, you will work from the end of the prices array toward the start, adding each price to the front of the monotonic queue. Every time you put a price in, some others may be discarded, so that the queue only holds items that are larger than all the previous ones. These are the the only items that could possibly be the ‘next higher price’. They are also monotonically increasing in the queue, so you can find the first one >= the target using a binary search. Since you need to know the index of the next higher value, you can store the indexes instead of the values themselves.That solves the upper limit problem. The lower limit is the similar, but with the queue monotonically decreasing instead.In python, it looks like this:def solve_upper(prices, limits): solved = [0]*len(prices) q = [0]*len(prices) qstart = len(q) for i in range(len(prices)-1, -1, -1): price = prices[i] while qstart < len(q) and prices[q[qstart]] <= price: # the price at the start of q needs to be discarded, since # it isn't greater than the new one qstart += 1 # prepend the new price qstart -= 1 q[qstart] = i limit = limits[i] # binary search to find the first price >= limit minpos = qstart maxpos = len(q) while minpos < maxpos: testpos = minpos + (maxpos - minpos)//2 if prices[q[testpos]] < limit: # too low minpos = testpos+1 else: # high enough maxpos = testpos if minpos < len(q): solved[i] = q[minpos]-i else: solved[i] = None return solveddef solve_lower(prices, limits): solved = [0]*len(prices) q = [0]*len(prices) qstart = len(q) for i in range(len(prices)-1, -1, -1): price = prices[i] while qstart < len(q) and prices[q[qstart]] >= price: # the price at the start of q needs to be discarded, since # it isn't less than the new one qstart += 1 # prepend the new price qstart -= 1 q[qstart] = i limit = limits[i] # binary search to find the first price <= limit minpos = qstart maxpos = len(q) while minpos < maxpos: testpos = minpos + (maxpos - minpos)//2 if prices[q[testpos]] > limit: # too low minpos = testpos+1 else: # high enough maxpos = testpos if minpos < len(q): solved[i] = q[minpos]-i else: solved[i] = None return solvedprices = [15,16,18,22,23,17,15,19,15,18]upper_limits = [17,18,21,23,25,22,18,21,18,20]lower_limits = [14,15,16,18,19,15,13,17,14,16]print(solve_upper(prices, upper_limits))print(solve_lower(prices, lower_limits))Output:[2, 1, 1, 1, None, None, 1, None, 1, None][None, 5, 4, 2, 1, 1, None, 1, None, None]NOTE: If you benchmark this answer against @btilly’s, please include the results in a comment!

Advertisement

Answer