Given a dataframe, I want to get the nonzero values of each row and then find the minimum of their absolute values. I want a user-defined function that does this for me. I also do not want to use a for loop, since the data is big.
My attempt:

import numpy as np
import pandas as pd

np.random.seed(5)
data = np.random.randn(16)
mask = np.random.permutation(16)[:6]
data[mask] = 0
df = pd.DataFrame(data.reshape(4, 4))

# df looks like this:
#           0         1         2         3
# 0  0.441227 -0.330870  2.430771  0.000000
# 1  0.000000  1.582481 -0.909232 -0.591637
# 2  0.000000 -0.329870 -1.192765  0.000000
# 3  0.000000  0.603472  0.000000 -0.700179

def udf(x):
    if x != 0:
        x_min = x.abs().min()
        return x_min

df.apply(udf, axis=1)

I get ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Question: How can I solve the above?
The desired answer is the following:

0.330870
0.591637
0.329870
0.603472

Answer
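The error occurs because x != 0 compares the whole row and returns a boolean Series, which an if statement cannot reduce to a single True or False. Instead, you can use x.ne(0) as a boolean mask to filter each row: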

res = df.apply(lambda x: x[x.ne(0)].abs().min(), axis=1)

print(res)

0    0.330870
1    0.591637
2    0.329870
3    0.603472
dtype: float64

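Since the question asks for a user-defined function, the lambda above can also be written as a named function. This is a minimal sketch: the name min_abs_nonzero and the small demo frame are illustrative assumptions, not part of the original question.

```python
import pandas as pd


def min_abs_nonzero(row: pd.Series) -> float:
    """Smallest absolute value among the nonzero entries of one row."""
    nonzero = row[row.ne(0)]     # boolean mask keeps only the nonzero entries
    return nonzero.abs().min()   # NaN if the row has no nonzero entries


# Small hand-made frame (illustrative values, not the question's data)
demo = pd.DataFrame([[0.5, -0.25, 0.0],
                     [0.0, 3.0, -2.0]])

result = demo.apply(min_abs_nonzero, axis=1)
print(result)
```

Note that apply with axis=1 still calls the function once per row, so it avoids an explicit for loop in your code but is not truly vectorized.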
Or skip apply, mask the whole dataframe at once, and use min(axis=1):

res = df[df.ne(0)].abs().min(axis=1)
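
To see why the masked version works, it may help to look at the intermediate step: indexing a dataframe with a boolean dataframe replaces the False positions with NaN, and min skips NaN by default. The sketch below uses a small hand-made frame (an assumption for illustration, including an all-zero row) and checks that both approaches agree.

```python
import numpy as np
import pandas as pd

# Illustrative data, not the question's; the last row is all zeros on purpose
demo = pd.DataFrame([[0.5, -0.25, 0.0],
                     [0.0, 3.0, -2.0],
                     [0.0, 0.0, 0.0]])

masked = demo[demo.ne(0)]               # zeros are replaced by NaN
vectorized = masked.abs().min(axis=1)   # NaN is skipped by default (skipna=True)

# Row-wise version from the first approach, for comparison
row_wise = demo.apply(lambda x: x[x.ne(0)].abs().min(), axis=1)

print(vectorized)
print(np.allclose(vectorized, row_wise, equal_nan=True))
```

A row with no nonzero values yields NaN in both versions, so you may want to handle that case explicitly (for example with fillna) if it can occur in your data.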