i have a df which I want to add a column that shows the student who is place (1) from the group(‘subject’, ‘class’) and update the column after there is a new place (1).
code:
data = { "subject": ['eng','math','math','math','math','math','math','math','math','math','math','math','math','eng','eng'], "class": ['Class_4','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3','Class_3', 'Class_3','Class_3','Class_3','Class_3','Class_4','Class_4'], "student": ['henry','pan','larry','larry','henry','larry','terry','henry','henry', 'henry','pan','pan','moose','pan','moose'], "place": [7,8,10,1,7,10,9,7,11,1,11,3,6,2,4]} df = pd.DataFrame(data)
╔═════════╦═════════╦═════════╦═══════╗ ║ subject ║ class ║ student ║ place ║ ╠═════════╬═════════╬═════════╬═══════╣ ║ eng ║ Class_4 ║ henry ║ 7 ║ ║ math ║ Class_3 ║ pan ║ 8 ║ ║ math ║ Class_3 ║ larry ║ 10 ║ ║ math ║ Class_3 ║ larry ║ 1 ║ ║ math ║ Class_3 ║ henry ║ 7 ║ ║ math ║ Class_3 ║ larry ║ 10 ║ ║ math ║ Class_3 ║ terry ║ 9 ║ ║ math ║ Class_3 ║ henry ║ 7 ║ ║ math ║ Class_3 ║ henry ║ 11 ║ ║ math ║ Class_3 ║ henry ║ 1 ║ ║ math ║ Class_3 ║ pan ║ 11 ║ ║ math ║ Class_3 ║ pan ║ 3 ║ ║ math ║ Class_3 ║ moose ║ 6 ║ ║ eng ║ Class_4 ║ pan ║ 2 ║ ║ eng ║ Class_4 ║ moose ║ 4 ║ ╚═════════╩═════════╩═════════╩═══════╝
result trying to get
╔═════════╦═════════╦═════════╦═══════╦═════════╗ ║ subject ║ class ║ student ║ place ║ new_col ║ ╠═════════╬═════════╬═════════╬═══════╬═════════╣ ║ eng ║ Class_4 ║ henry ║ 7 ║ nil ║ ║ math ║ Class_3 ║ pan ║ 8 ║ nil ║ ║ math ║ Class_3 ║ larry ║ 10 ║ nil ║ ║ math ║ Class_3 ║ larry ║ 1 ║ nil ║ ║ math ║ Class_3 ║ henry ║ 7 ║ larry ║ ║ math ║ Class_3 ║ larry ║ 10 ║ larry ║ ║ math ║ Class_3 ║ terry ║ 9 ║ larry ║ ║ math ║ Class_3 ║ henry ║ 7 ║ larry ║ ║ math ║ Class_3 ║ henry ║ 11 ║ larry ║ ║ math ║ Class_3 ║ henry ║ 1 ║ larry ║ ║ math ║ Class_3 ║ pan ║ 11 ║ henry ║ ║ math ║ Class_3 ║ pan ║ 3 ║ henry ║ ║ math ║ Class_3 ║ moose ║ 6 ║ henry ║ ║ eng ║ Class_4 ║ pan ║ 2 ║ nil ║ ║ eng ║ Class_4 ║ moose ║ 4 ║ nil ║ ╚═════════╩═════════╩═════════╩═══════╩═════════╝
kindly advise. thank you
Advertisement
Answer
Mask the values in place
column which are not equal to 1
then group
the masked column by subject
and class
and use ffill
to forward fill the values
df['new_col'] = df['student'].mask(df['place'] != 1) df['new_col'] = df.groupby(['subject', 'class'])['new_col'].ffill()
print(df) subject class student place new_col 0 eng Class_4 henry 7 NaN 1 math Class_3 pan 8 NaN 2 math Class_3 larry 10 NaN 3 math Class_3 larry 1 larry 4 math Class_3 henry 7 larry 5 math Class_3 larry 10 larry 6 math Class_3 terry 9 larry 7 math Class_3 henry 7 larry 8 math Class_3 henry 11 larry 9 math Class_3 henry 1 henry 10 math Class_3 pan 11 henry 11 math Class_3 pan 3 henry 12 math Class_3 moose 6 henry 13 eng Class_4 pan 2 NaN 14 eng Class_4 moose 4 NaN