How to sort a dataframe by the first occurrences of each unique element in a column?

The dataframe is

import pandas as pd

df = pd.DataFrame({"necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
                   "fehmi": ["trial", "error", "manifest", "trial", "no", "only", "error", "no", "no"]})

which looks like

   necmi     fehmi
0      0     trial
1      3     error
2     14  manifest
3     15     trial
4      2        no
5     71      only
6      8     error
7      2        no
8     -1        no

I'd like to sort this df by the fehmi column, in the order in which each entry first occurs, so that equal entries end up grouped together. The desired output is

   necmi     fehmi
0      0     trial
1     15     trial
2      3     error
3      8     error
4     14  manifest
5      2        no
6      2        no
7     -1        no
8     71      only

because we saw trial first in df, so we gather its entries together. Then we saw error, so those come together next, and so on.

I attempted a groupby with sort=False, as it seemed natural, but...

df.groupby("fehmi", sort=False)

I imagine the groups are almost in the form I need, but this is a “groupby object” and I cannot get it into a usable form. I tried this to get the groups as they are

df.groupby("fehmi", sort=False).apply(lambda s: s)

but it gives the original df back!

Answer

`factorize` + `argsort`

# kind='stable' keeps the rows of each group in their original order
df.iloc[np.argsort(df['fehmi'].factorize()[0], kind='stable')]

   necmi     fehmi
0      0     trial
3     15     trial
1      3     error
6      8     error
2     14  manifest
4      2        no
7      2        no
8     -1        no
5     71      only
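
To see why this works, it may help to split the one-liner into its steps. The following is only a sketch built on the question's own dataframe. The names `codes`, `uniques`, `order` and `result` are mine, and the `reset_index(drop=True)` at the end is an optional extra, assuming you want the renumbered 0..8 index shown in the desired output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
                   "fehmi": ["trial", "error", "manifest", "trial", "no", "only", "error", "no", "no"]})

# factorize numbers each unique value by order of first appearance:
# trial -> 0, error -> 1, manifest -> 2, no -> 3, only -> 4
codes, uniques = df["fehmi"].factorize()

# A stable argsort of those codes gives the row positions grouped by code,
# with the original row order preserved inside each group.
order = np.argsort(codes, kind="stable")

# Optional: renumber the index 0..8 to match the desired output exactly.
result = df.iloc[order].reset_index(drop=True)
print(result)
```

Without the `reset_index` call, the rows keep their original index labels, as in the output above.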
