I have the following dataframe:
    import pandas as pd

    df = pd.DataFrame({
        'tmp': ['A', 'A', 'B', 'Z', 'D', 'C'],
        'F1': [2, 1, 9, 8, 7, 4],
        'F20': [0, 1, 9, 4, 2, 3],
        'F3': ['a', 'B', 'c', 'D', 'e', 'F'],
        'aabb': ['a', 'B', 'c', 'D', 'e', 'F']
    })

      tmp  F1  F20 F3 aabb
    0   A   2    0  a    a
    1   A   1    1  B    B
    2   B   9    9  c    c
    3   Z   8    4  D    D
    4   D   7    2  e    e
    5   C   4    3  F    F
and I would like to sort only the columns whose names start with F, by the number that follows, like this:
      tmp  F1 F3  F20 aabb
    0   A   2  a    0    a
    1   A   1  B    1    B
    2   B   9  c    9    c
    3   Z   8  D    4    D
    4   D   7  e    2    e
    5   C   4  F    3    F
How can I do this?
(edit) The "F" columns can vary both in how many there are and in the numbers that follow the F (in my case I have about 100 such columns). The F columns are always grouped together, but the number of columns before and after them varies.
Answer
You can use natsort for natural sorting, and a boolean mask to reorder only the F columns:
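    # pip install natsort
    from natsort import natsorted

    # Writable copy of the column names
    cols = df.columns.to_numpy(copy=True)
    # True only for names that are exactly "F" followed by digits
    m = df.columns.str.fullmatch(r'F\d+')
    # Sort the masked names in place; other columns keep their positions
    cols[m] = natsorted(cols[m])

    df_sorted = df[cols]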
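Natural sorting matters here because a plain lexicographic sort compares names character by character, so 'F20' ends up before 'F3'. A minimal standard-library sketch of the difference, using the column names from the question (the `natural_key` helper is a hypothetical name for illustration, not part of natsort):

```python
import re

names = ['F1', 'F20', 'F3']

def natural_key(name):
    # Hypothetical helper: split the name into text and digit chunks so that
    # digit runs are compared as integers rather than character by character.
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', name)]

lexicographic = sorted(names)             # ['F1', 'F20', 'F3']
natural = sorted(names, key=natural_key)  # ['F1', 'F3', 'F20']
```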
Alternative without natsort:
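    import numpy as np

    # Numeric suffix of each F column; NaN for columns that do not match
    num = df.columns.str.extract(r'F(\d+)', expand=False).astype(float)
    cols = df.columns.to_numpy(copy=True)
    # Mask of the F columns
    m = num.notna()
    # Order of the F columns by their numeric suffix
    order = np.argsort(num[m])
    cols[m] = cols[m][order]

    df_sorted = df[cols]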
Output:

      tmp  F1 F3  F20 aabb
    0   A   2  a    0    a
    1   A   1  B    1    B
    2   B   9  c    9    c
    3   Z   8  D    4    D
    4   D   7  e    2    e
    5   C   4  F    3    F
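If you want something reusable that doesn't depend on natsort or numpy, the same mask-and-reorder idea can be sketched on a plain list of column names. This is only a sketch: `sort_prefixed_columns` and its `prefix` parameter are hypothetical names introduced here, and it assumes the columns to sort are named exactly the prefix followed by digits.

```python
import re

def sort_prefixed_columns(columns, prefix='F'):
    """Return the names with `<prefix><digits>` ones sorted numerically,
    leaving every other name at its original position."""
    cols = list(columns)
    pattern = re.compile(re.escape(prefix) + r'(\d+)')
    # Positions of the matching columns, e.g. F1, F20, F3
    slots = [i for i, c in enumerate(cols) if pattern.fullmatch(str(c))]
    # Matching names ordered by their numeric suffix
    ordered = sorted((cols[i] for i in slots),
                     key=lambda c: int(pattern.fullmatch(str(c)).group(1)))
    # Write the sorted names back into the same positions
    for i, name in zip(slots, ordered):
        cols[i] = name
    return cols

new_order = sort_prefixed_columns(['tmp', 'F1', 'F20', 'F3', 'aabb'])
```

With the dataframe from the question, this would be used as `df_sorted = df[sort_prefixed_columns(df.columns)]`.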