The dataframe is

    import pandas as pd

    df = pd.DataFrame({"necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
                       "fehmi": ["trial", "error", "manifest", "trial", "no", "only", "error", "no", "no"]})
which looks like

       necmi     fehmi
    0      0     trial
    1      3     error
    2     14  manifest
    3     15     trial
    4      2        no
    5     71      only
    6      8     error
    7      2        no
    8     -1        no
I'd like to sort this df by the fehmi column, in order of the first occurrence of each entry, so that equal entries end up grouped together. The desired output is

       necmi     fehmi
    0      0     trial
    1     15     trial
    2      3     error
    3      8     error
    4     14  manifest
    5      2        no
    6      2        no
    7     -1        no
    8     71      only
because we saw trial first in df, so its rows are gathered together. Then we saw error, so those rows come next, and so on.
I attempted a groupby with sort=False, as that seemed natural, but...

    df.groupby("fehmi", sort=False)
I imagine the groups are almost in the form I need, but the result is a "groupby object" and I cannot get it into a usable form. I tried this to get the groups as they are:

    df.groupby("fehmi", sort=False).apply(lambda s: s)
but it gives the original df back!
Answer
factorize + argsort

    df.iloc[np.argsort(df['fehmi'].factorize()[0])]

       necmi     fehmi
    0      0     trial
    3     15     trial
    1      3     error
    6      8     error
    2     14  manifest
    4      2        no
    7      2        no
    8     -1        no
    5     71      only
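To see why this works, the one-liner can be unpacked step by step. The sketch below is a variation on the answer, not a replacement for it, and makes two assumptions of its own. First, it passes `kind="stable"` to `np.argsort`, because the default `quicksort` is not documented as stable and rows within a group could otherwise change relative order on larger frames. Second, it calls `reset_index(drop=True)` so the index runs 0..8 as in the desired output from the question. The variable names `codes`, `uniques`, `order` and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
    "fehmi": ["trial", "error", "manifest", "trial", "no", "only", "error", "no", "no"],
})

# factorize labels each value by order of first appearance:
# trial -> 0, error -> 1, manifest -> 2, no -> 3, only -> 4
codes, uniques = pd.factorize(df["fehmi"])

# A stable sort keeps rows with equal codes in their original relative order.
order = np.argsort(codes, kind="stable")

# Reorder the rows positionally, then renumber the index from 0.
result = df.iloc[order].reset_index(drop=True)
print(result)
```

Dropping the `reset_index` call keeps the original row labels, which gives the output shown in the answer.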