Skip to content
Advertisement

Any easy way to transform a missing number sequence to its range?

Suppose I have a list that goes like : ”’ [1,2,3,4,9,10,11,20] ”’ I need the result to be like : ”’ [[4,9],[11,20]] ”’ I have defined a function that goes like this :

JavaScript

Is there any other easier and efficient way to do it? I need to do this in the range of millions.

Advertisement

Answer

You can use numpy arrays and the diff function that comes along with them. Numpy is so much more efficient than looping when you have millions of rows.


Slight aside: Why are numpy arrays so fast? Because they are arrays of data instead of arrays of pointers to data (which is what Python lists are), because they offload a whole bunch of computations to a backend written in C, and because they leverage the SIMD paradigm to run a Single Instruction on Multiple Data simultaneously.


Now back to the problem at hand:

The diff function gives us the difference between consecutive elements of the array. Pretty convenient, given that we need to find where this difference is greater than a known threshold!

JavaScript

You can access all_gaps like a 2D matrix: all_gaps[0, 1] would give you 9, for example. If you really need the answer as a list-of-lists, simply convert it like so:

JavaScript

Comparing the runtime of the iterative method from @happydave’s answer with the numpy method:

JavaScript

On my laptop, this gives:

JavaScript

My word that’s fast!

Advertisement