I have the following dataframe:
Location | Student Name |
---|---|
D | Amy |
D | Raj |
E | Mitch |
F | Poo |
F | Mack |
I am trying to generate the following dataframe:
Location | Student Name |
---|---|
D | Amy |
D | Raj |
Total Students at D | 2 |
E | Mitch |
Total Students at E | 1 |
F | Poo |
F | Mack |
Total Students at F | 2 |
Grand Total | 5 |
How do I do that?
Answer
I will offer a solution without loops.

    import pandas as pd

    df = pd.DataFrame({'Location': ['D', 'D', 'E', 'F', 'F'],
                       'Student Name': ['Amy', 'Raj', 'Mitch', 'Poo', 'Mack']})
    # Count the students at each location
    df1 = df.groupby('Location', as_index=False).agg({'Student Name': 'count'})
    # Tag the count rows so they sort directly after their location's rows
    df1['Location'] = df1['Location'] + 'Total'
    # A stable sort keeps the students of each location in their original order
    df2 = pd.concat([df, df1]).sort_values(by='Location', kind='stable')
    # Turn the temporary tags into the final labels
    df2['Location'] = df2['Location'].apply(
        lambda x: 'Total Students at ' + x[:-len('Total')] if x.endswith('Total') else x)
    df2 = df2.reset_index(drop=True)
    # DataFrame.append was removed in pandas 2.0, so add the grand total row by label
    df2.loc[len(df2)] = ['Grand Total', df1['Student Name'].sum()]

Output:

    df2

                  Location Student Name
    0                    D          Amy
    1                    D          Raj
    2  Total Students at D            2
    3                    E        Mitch
    4  Total Students at E            1
    5                    F          Poo
    6                    F         Mack
    7  Total Students at F            2
    8          Grand Total            5
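
The tag-and-sort step in the answer depends on how the location names sort as strings: if one name is a prefix of another (say `D` and `DA`), the subtotal row for `D` would sort after the `DA` rows. As a hedged alternative, here is a minimal sketch that builds the result group by group. It does use a loop, unlike the answer above, and the helper name `add_totals` and its parameters are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def add_totals(df, key="Location", value="Student Name"):
    """Hypothetical helper: append a subtotal row after each group, then a grand total."""
    pieces = []
    # sort=False keeps the groups in the order they first appear in df
    for loc, group in df.groupby(key, sort=False):
        pieces.append(group)
        pieces.append(pd.DataFrame({key: [f"Total Students at {loc}"],
                                    value: [len(group)]}))
    pieces.append(pd.DataFrame({key: ["Grand Total"], value: [len(df)]}))
    return pd.concat(pieces, ignore_index=True)

df = pd.DataFrame({"Location": ["D", "D", "E", "F", "F"],
                   "Student Name": ["Amy", "Raj", "Mitch", "Poo", "Mack"]})
result = add_totals(df)
print(result)
```

Because each subtotal is appended right after its own group, no string sorting is involved, and the locations stay in the order in which they appear in the input.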