For K-fold validation, I would like to slice a NumPy array so that I get a view of the original array with every nth element removed.
For example:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
If n = 4
then the result would be
[1, 2, 4, 5, 6, 8, 9]
Note: the numpy requirement is due to this being used for a machine learning assignment where the dependencies are fixed.
Answer
Approach #1 with modulus
a[np.mod(np.arange(a.size),4)!=0]
Sample run –
In [255]: a
Out[255]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [256]: a[np.mod(np.arange(a.size),4)!=0]
Out[256]: array([1, 2, 3, 5, 6, 7, 9])
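For K-fold use, the modulus approach can be wrapped in a small helper. The following is a minimal sketch, not part of the original answer: the function name drop_every_nth and its offset parameter are hypothetical, added so that a different fold can be held out on each pass. Note that it returns a copy, not a view.

```python
import numpy as np

def drop_every_nth(a, n, offset=0):
    """Return a copy of 1-D `a` without the elements at offset, offset+n, ...

    Hypothetical helper; `offset` selects which fold is dropped.
    """
    a = np.asarray(a)
    # True for every position that is NOT a multiple of n away from offset
    keep = (np.arange(a.size) - offset) % n != 0
    return a[keep]

a = np.arange(10)
print(drop_every_nth(a, 4).tolist())            # [1, 2, 3, 5, 6, 7, 9]
print(drop_every_nth(a, 4, offset=1).tolist())  # [0, 2, 3, 4, 6, 7, 8]
```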
Approach #2 with masking: Requirement as a view
Considering the view requirement, if the idea is to save memory, we could instead store an equivalent boolean mask. At one byte per element, it occupies 8 times less memory than a 64-bit integer array (the default integer dtype on a 64-bit Linux system). Such a mask-based approach would look like this –
# Create mask
mask = np.ones(a.size, dtype=bool)
mask[::4] = 0
Here’s the memory requirement stat –
In [311]: mask.itemsize
Out[311]: 1

In [312]: a.itemsize
Out[312]: 8
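Then, we could assign through the boolean mask, which writes into the original array in place, much as a view would (reading a[mask] on its own returns a copy, not a view) –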
In [313]: a
Out[313]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [314]: a[mask] = 10

In [315]: a
Out[315]: array([ 0, 10, 10, 10,  4, 10, 10, 10,  8, 10])
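The difference between reading and writing with a boolean mask can be checked directly. This is a small sketch using np.shares_memory; the variable name selected is only illustrative.

```python
import numpy as np

a = np.arange(10)
mask = np.ones(a.size, dtype=bool)
mask[::4] = False

# Reading with a boolean mask allocates a new array (a copy)
selected = a[mask]
print(np.shares_memory(a, selected))  # False
selected[:] = -1                      # does not touch `a`
print(a.tolist())                     # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Assigning through the mask writes into `a` itself
a[mask] = 10
print(a.tolist())                     # [0, 10, 10, 10, 4, 10, 10, 10, 8, 10]
```

So the mask saves memory only as long as the selected elements are not materialized with a[mask].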
Approach #3 with NumPy array strides: Requirement as a view
You can use np.lib.stride_tricks.as_strided to create such a view, provided the length of the input array is a multiple of n. If it’s not a multiple, it would still work, but it isn’t safe, as the view would reach beyond the memory allocated for the input array. Please note that the view thus created would be 2D.
Thus, an implementation to get such a view would be –
def skipped_view(a, n):
    s = a.strides[0]
    strided = np.lib.stride_tricks.as_strided
    return strided(a,shape=((a.size+n-1)//n,n),strides=(n*s,s))[:,1:]
Sample run –
In [50]: a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]) # Input array

In [51]: a_out = skipped_view(a, 4)

In [52]: a_out
Out[52]:
array([[ 1,  2,  3],
       [ 5,  6,  7],
       [ 9, 10, 11]])

In [53]: a_out[:] = 100 # Let's prove output is a view indeed

In [54]: a
Out[54]: array([  0, 100, 100, 100,   4, 100, 100, 100,   8, 100, 100, 100])
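When the length is not a multiple of n, one way to stay inside the allocated memory is to trim the array to whole blocks first. The sketch below is an alternative to the as_strided version, not the answer's own method: the name skipped_view_safe is hypothetical, it assumes a 1-D input, and it uses a plain slice plus reshape. The leftover tail elements are not part of the view.

```python
import numpy as np

def skipped_view_safe(a, n):
    """2-D view of 1-D `a` that skips the first element of each block of n.

    Hypothetical helper: only the first (a.size // n) * n elements are
    covered, so the view never reaches past the end of `a`.
    """
    m = a.size // n                        # number of complete blocks
    return a[:m * n].reshape(m, n)[:, 1:]  # slice + reshape + slice

a = np.arange(10)
v = skipped_view_safe(a, 4)
print(v.tolist())             # [[1, 2, 3], [5, 6, 7]]
print(np.shares_memory(a, v)) # True

v[:] = 100                    # writes through to `a`
print(a.tolist())             # [0, 100, 100, 100, 4, 100, 100, 100, 8, 9]
```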