How to remove numpy columns based on condition?

I have a numpy array which contains the correlation of each column with a label column:

[0.5 -0.02 0.2]

And also a numpy array containing the column data:

[[0.42 0.35 0.6]
 [0.3  0.34 0.2]]

Can I use a function to determine which columns to keep?

Such as

abs(cors) > 0.05

It will yield

[True False True]

Then the resulting numpy array becomes

[[0.42 0.6]
 [0.3  0.2]]

May I know how to achieve this?

Answer

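You can do elementwise boolean indexing with a mask of the same shape as the array; the result is a flattened 1-D array of the selected values:

import numpy as np

a = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
b = np.array([
    [True, False, True],
    [False, True, False]
])
new_a = a[b]  # array([1, 3, 5])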

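Or, to select whole rows and columns with boolean masks, wrap the masks in np.ix_ (passing them directly as a[c, b] pairs up the selected positions instead of taking every row/column combination):

import numpy as np

a = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
b = np.array([True, False, True])  # column mask
c = np.array([False, True])        # row mask
new_a = a[np.ix_(c, b)]  # array([[4, 6]])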
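
A small sketch of how a[rows, cols] with two boolean masks differs from a[np.ix_(rows, cols)]. The names rows, cols, paired, block and chained are made up for illustration and are not part of the question:

```python
import numpy as np

a = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
rows = np.array([True, True])          # keep both rows
cols = np.array([True, False, True])   # keep columns 0 and 2

# Two boolean masks passed together are converted to index arrays and
# paired element by element: this picks a[0, 0] and a[1, 2].
paired = a[rows, cols]

# np.ix_ builds an open mesh, so every kept row is combined
# with every kept column.
block = a[np.ix_(rows, cols)]

# Chaining two single-axis selections gives the same 2-D block.
chained = a[rows][:, cols]

print(paired)
print(block)
```

If the two masks select different numbers of positions (and neither selects exactly one), the paired form raises an IndexError, so np.ix_ or chained indexing is usually the safer choice when you want a sub-block.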

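So, for your example you could do:

import numpy as np

a = np.array([
    [0.42, 0.35, 0.6],
    [0.3, 0.34, 0.2]
])
cors = np.array([0.5, -0.02, 0.2])
# np.abs(cors) > 0.05 gives the column mask [True, False, True]
new_a = a[:, np.abs(cors) > 0.05]  # array([[0.42, 0.6], [0.3, 0.2]])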
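
Since the question asks about using a function, the same column selection could be wrapped up roughly like this. This is only a sketch: the name keep_columns, its threshold parameter and the shape check are assumptions added for illustration, not something NumPy provides:

```python
import numpy as np

def keep_columns(data, cors, threshold=0.05):
    """Return the columns of `data` whose absolute correlation exceeds `threshold`."""
    data = np.asarray(data)
    cors = np.asarray(cors)
    # Hypothetical sanity check: expect one correlation per column of a 2-D array.
    if data.ndim != 2 or cors.shape != (data.shape[1],):
        raise ValueError("cors must have one entry per column of a 2-D data array")
    mask = np.abs(cors) > threshold  # one True/False per column
    return data[:, mask]

a = np.array([
    [0.42, 0.35, 0.6],
    [0.3, 0.34, 0.2]
])
cors = np.array([0.5, -0.02, 0.2])
new_a = keep_columns(a, cors)
print(new_a)
```

Keeping the threshold as a parameter makes it easy to try other cutoffs without touching the indexing itself.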
User contributions licensed under: CC BY-SA