How to sort a dataframe by the first occurrence of each unique element in a column?

The dataframe is

import pandas as pd

df = pd.DataFrame({"necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
                   "fehmi": ["trial", "error", "manifest", "trial", "no", "only", "error", "no", "no"]})

which prints as

   necmi     fehmi
0      0     trial
1      3     error
2     14  manifest
3     15     trial
4      2        no
5     71      only
6      8     error
7      2        no
8     -1        no

I'd like to sort this df by the fehmi column, ordering the values by their first occurrence so that equal entries end up grouped together. The desired output is

   necmi     fehmi
0      0     trial
1     15     trial
2      3     error
3      8     error
4     14  manifest
5      2        no
6      2        no
7     -1        no
8     71      only

because we saw trial first in df, so we gather its entries together. Then we saw error, so its entries are gathered together, and so on.

I attempted a groupby with sort=False, as that seemed natural, but..

df.groupby("fehmi", sort=False)

I imagine the groups are almost in the form I need, but the result is a "groupby object" and I cannot turn it into a dataframe. I tried this to get the groups back as they are

df.groupby("fehmi", sort=False).apply(lambda s: s)

but it gives the original df back!

Answer

`factorize` + `argsort`

df.iloc[np.argsort(df['fehmi'].factorize()[0])]

   necmi     fehmi
0      0     trial
3     15     trial
1      3     error
6      8     error
2     14  manifest
4      2        no
7      2        no
8     -1        no
5     71      only
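
To spell out what the one-liner does, here is a minimal step-by-step sketch on the question's data. Two details are additions of mine, not part of the answer above: `kind="stable"`, since NumPy's default sort kind does not guarantee that rows with equal codes keep their original relative order, and `reset_index(drop=True)`, which renumbers the rows 0 to 8 as in the question's desired output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
    "fehmi": ["trial", "error", "manifest", "trial", "no", "only", "error", "no", "no"],
})

# factorize() labels each value by the order of its first appearance:
# trial -> 0, error -> 1, manifest -> 2, no -> 3, only -> 4
codes, uniques = pd.factorize(df["fehmi"])

# Assumption: request a stable sort so rows sharing a code stay in
# their original order within each group.
order = np.argsort(codes, kind="stable")

# Reorder the rows by position, then renumber the index (optional).
result = df.iloc[order].reset_index(drop=True)
print(result)
```

If the original row labels should be kept, as in the output shown above, drop the `reset_index` call.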
