Selecting first row from each subgroup (pandas)

Question

How can I select the subset of rows where distance is lowest, grouping by the date and p columns? Ideally, the returned dataframe should contain one row per (date, p) group.

Accepted Answer

One way is to use groupby + idxmin to get the index of the smallest distance per group, then use loc to get the desired output:

    out = df.loc[df.groupby(['date', 'p'])['distance'].idxmin()]

Output:

           v     p  distance        date
    0  14.60   sst   22454.1  2021-12-30
    3   1.67  wvht   23141.8  2021-12-30
    6   1.70  wvht   23141.4  2021-12-31
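To try this end to end, here is a minimal sketch. The question's input dataframe is not shown, so the sample data below is an assumption: only rows 0, 3 and 6 are taken from the output above, and the remaining rows are made up for illustration. The sketch also shows a common alternative (sort_values + drop_duplicates), which is not part of the original answer.

```python
import pandas as pd

# Hypothetical input: rows 0, 3 and 6 match the output shown in the answer;
# rows 1, 2, 4 and 5 are invented so that each group has more than one row.
df = pd.DataFrame({
    "v": [14.60, 14.50, 14.40, 1.67, 1.60, 1.55, 1.70],
    "p": ["sst", "sst", "sst", "wvht", "wvht", "wvht", "wvht"],
    "distance": [22454.1, 22500.0, 22600.0, 23141.8, 23200.0, 23300.0, 23141.4],
    "date": ["2021-12-30", "2021-12-30", "2021-12-30",
             "2021-12-30", "2021-12-30", "2021-12-31", "2021-12-31"],
})

# Approach from the answer: idxmin returns the index label of the smallest
# distance in each (date, p) group, and loc selects those rows.
out = df.loc[df.groupby(["date", "p"])["distance"].idxmin()]

# Alternative (not from the answer): sort by distance, then keep the first
# row seen for each (date, p) pair. Rows come back ordered by distance,
# so sort_index() restores the original row order.
alt = df.sort_values("distance").drop_duplicates(["date", "p"]).sort_index()

print(out)
print(alt)
```

On this sample data both approaches select rows 0, 3 and 6. They may pick different rows when a group has tied minimum distances, so the idxmin version is the safer choice if you need the first occurrence.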


Answer