Skip to content
Advertisement

conditional groupby and update column – python, pandas, groupby

i have a df which I want to add a column that shows the student who is place (1) from the group(‘subject’, ‘class’) and update the column after there is a new place (1).

code:

data = {
    "subject": ['eng','math','math','math','math','math','math','math','math','math','math','math','math','eng','eng'],
    "class": ['Class_4','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3',
              'Class_3','Class_3','Class_3','Class_3','Class_4','Class_4'],
    "student": ['henry','pan','larry','larry','henry','larry','terry','henry','henry',
                'henry','pan','pan','moose','pan','moose'],
    "place": [7,8,10,1,7,10,9,7,11,1,11,3,6,2,4]}

df = pd.DataFrame(data)
╔═════════╦═════════╦═════════╦═══════╗
║ subject ║  class  ║ student ║ place ║
╠═════════╬═════════╬═════════╬═══════╣
║ eng     ║ Class_4 ║ henry   ║     7 ║
║ math    ║ Class_3 ║ pan     ║     8 ║
║ math    ║ Class_3 ║ larry   ║    10 ║
║ math    ║ Class_3 ║ larry   ║     1 ║
║ math    ║ Class_3 ║ henry   ║     7 ║
║ math    ║ Class_3 ║ larry   ║    10 ║
║ math    ║ Class_3 ║ terry   ║     9 ║
║ math    ║ Class_3 ║ henry   ║     7 ║
║ math    ║ Class_3 ║ henry   ║    11 ║
║ math    ║ Class_3 ║ henry   ║     1 ║
║ math    ║ Class_3 ║ pan     ║    11 ║
║ math    ║ Class_3 ║ pan     ║     3 ║
║ math    ║ Class_3 ║ moose   ║     6 ║
║ eng     ║ Class_4 ║ pan     ║     2 ║
║ eng     ║ Class_4 ║ moose   ║     4 ║
╚═════════╩═════════╩═════════╩═══════╝

result trying to get

╔═════════╦═════════╦═════════╦═══════╦═════════╗
║ subject ║  class  ║ student ║ place ║ new_col ║
╠═════════╬═════════╬═════════╬═══════╬═════════╣
║ eng     ║ Class_4 ║ henry   ║     7 ║ nil     ║
║ math    ║ Class_3 ║ pan     ║     8 ║ nil     ║
║ math    ║ Class_3 ║ larry   ║    10 ║ nil     ║
║ math    ║ Class_3 ║ larry   ║     1 ║ nil     ║
║ math    ║ Class_3 ║ henry   ║     7 ║ larry   ║
║ math    ║ Class_3 ║ larry   ║    10 ║ larry   ║
║ math    ║ Class_3 ║ terry   ║     9 ║ larry   ║
║ math    ║ Class_3 ║ henry   ║     7 ║ larry   ║
║ math    ║ Class_3 ║ henry   ║    11 ║ larry   ║
║ math    ║ Class_3 ║ henry   ║     1 ║ larry   ║
║ math    ║ Class_3 ║ pan     ║    11 ║ henry   ║
║ math    ║ Class_3 ║ pan     ║     3 ║ henry   ║
║ math    ║ Class_3 ║ moose   ║     6 ║ henry   ║
║ eng     ║ Class_4 ║ pan     ║     2 ║ nil     ║
║ eng     ║ Class_4 ║ moose   ║     4 ║ nil     ║
╚═════════╩═════════╩═════════╩═══════╩═════════╝

kindly advise. thank you

Advertisement

Answer

Mask the values in place column which are not equal to 1 then group the masked column by subject and class and use ffill to forward fill the values

df['new_col'] = df['student'].mask(df['place'] != 1)
df['new_col'] = df.groupby(['subject', 'class'])['new_col'].ffill()
print(df)

   subject    class student  place new_col
0      eng  Class_4   henry      7     NaN
1     math  Class_3     pan      8     NaN
2     math  Class_3   larry     10     NaN
3     math  Class_3   larry      1   larry
4     math  Class_3   henry      7   larry
5     math  Class_3   larry     10   larry
6     math  Class_3   terry      9   larry
7     math  Class_3   henry      7   larry
8     math  Class_3   henry     11   larry
9     math  Class_3   henry      1   henry
10    math  Class_3     pan     11   henry
11    math  Class_3     pan      3   henry
12    math  Class_3   moose      6   henry
13     eng  Class_4     pan      2     NaN
14     eng  Class_4   moose      4     NaN
User contributions licensed under: CC BY-SA
3 People found this is helpful
Advertisement