I have a panda’s dataframe that is something like this
el1 el2 x el3 n m 6 f n2 m2 7 f2 .... n10 m10 19.3 f10 n11 m11 21 f11 ....
The el1, el2, and el3 do not matter at all. I want to find the row with the X nearest to x=20
so I do
min_index=df['x'].sub(20).abs().idxmin()
which gives me the index where x=19.3 is
So far so good
Now, the real problem
Imagine I have a dataframe where x goes from 6 to 31 and then again from 7 to 45 and then again from 2 to 27
I want a way to find a list of indexes [idx1, idx2...idxn]
(in the above example n would be 3) with the most approximate values to say for example 20.
How can I do this with pandas and python?
EDIT: An example of the df would be
el1 el2 x el3 n m 6 f n2 m2 7 f2 .... n10 m10 19.3 f10 n11 m11 21 f11 .... n20 m20 31 f20 n21 m21 7 f21 n22 m22 8.1 f22 .... n29 m29 19.8 f29 n30 m30 21 f30 ... n35 m35 45 f35 n36 m36 2 f36 n37 m37 3 f37 .... n45 m45 19.9 f45 n46 m46 22 f46 ... n50 m50 27 f50
The rows I want are the ones where x=19.3 x=19.8 x=x=19.9
Advertisement
Answer
Assuming you have consecutive stretches of increasing values and want to find the closest to 20 for each:
group = df['x'].diff().lt(0).cumsum() out = df.loc[df['x'].sub(20).abs().groupby(group).idxmin()]
example input:
import numpy as np df = pd.DataFrame({'x': np.r_[np.linspace(6,31,5), np.linspace(7,45,5), np.linspace(2,27,5), ], 'el1': '-' })
output:
x el1 2 18.50 - 6 16.50 - 13 20.75 -