I have the following dataframe:
Location | Student Name |
---|---|
D | Amy |
D | Raj |
E | Mitch |
F | Poo |
F | Mack |
I am trying to generate the following dataframe:
Location | Student Name |
---|---|
D | Amy |
D | Raj |
Total Students at D | 2 |
E | Mitch |
Total Students at E | 1 |
F | Poo |
F | Mack |
Total Students at F | 2 |
Grand Total | 5 |
How do I do that?
Advertisement
Answer
I will offer a solution without loops.
df = pd.DataFrame({'Location':['D','D','E','F','F'], 'Student Name':['Amy', 'Raj', 'Mitch', 'Poo', 'Mack']}) df1 = df.groupby('Location', as_index = False).agg({'Student Name':'count'}) df1['Location'] = df1['Location'].apply(lambda row : row + 'Total' ) df2 = pd.concat([df, df1]).sort_values(by = 'Location') df2['Location'] = df2['Location'].apply(lambda x : 'Total Students at ' + x[:len(x)-len('Total')] if x.endswith('Total') else x) df2 = df2.reset_index() df2.drop(['index'], axis = 1, inplace = True) df2 = df2.append({'Location' : 'Grand Total', 'Student Name' : df1['Student Name'].sum()}, ignore_index = True)
Output :
df2 Location Student Name 0 D Amy 1 D Raj 2 Total Students at D 2 3 E Mitch 4 Total Students at E 1 5 F Poo 6 F Mack 7 Total Students at F 2 8 Grand Total 5