I have the data frame below:
a
100
200
200
b
20
30
40
c
400
50
I need help calculating the sum of the values for each item and placing it in a second column, so that the result looks like this:
a 500
100
200
200
b 90
20
30
40
c 450
400
50
Answer
If you need the sum per group, convert the column col to numeric with pd.to_numeric, then use GroupBy.transform, building the group key by keeping only the non-numeric values and repeating them downwards with ffill:
import numpy as np
import pandas as pd

s = pd.to_numeric(df['col'], errors='coerce')  # labels such as 'a' become NaN
mask = s.isna()  # True on the label rows
df.loc[mask, 'new'] = s.groupby(df['col'].where(mask).ffill()).transform('sum')
print (df)
col new
0 a 500.0
1 100 NaN
2 200 NaN
3 200 NaN
4 b 90.0
5 20 NaN
6 30 NaN
7 40 NaN
8 c 450.0
9 400 NaN
10 50 NaN
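Or, to get integer sums with empty strings instead of NaN (note that the new column then holds strings):
new = s.groupby(df['col'].where(mask).ffill()).transform('sum')  # per-group sum on every row
df['new'] = np.where(mask, new.astype(int), '')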
print (df)
col new
0 a 500
1 100
2 200
3 200
4 b 90
5 20
6 30
7 40
8 c 450
9 400
10 50
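For anyone who wants to run the approach end to end, here is a minimal self-contained sketch. The frame is an assumed reconstruction of the question's data (a single string column named col), and the names key and totals are only illustrative. It spells out the intermediate group key so you can see what where(mask).ffill() produces:

```python
import pandas as pd

# Assumed reconstruction of the question's data: one string column named 'col'
df = pd.DataFrame({'col': ['a', '100', '200', '200',
                           'b', '20', '30', '40',
                           'c', '400', '50']})

# Labels ('a', 'b', 'c') become NaN, numeric strings become floats
s = pd.to_numeric(df['col'], errors='coerce')
mask = s.isna()

# Keep only the labels, then forward-fill so every row carries its group label
key = df['col'].where(mask).ffill()

# Per-group sum, broadcast back to every row of the group
totals = s.groupby(key).transform('sum')

# Keep the sum only on the label rows; the other rows stay NaN
df['new'] = totals.where(mask)
print(df)
```

Using totals.where(mask) here is just an alternative spelling of the df.loc[mask, 'new'] assignment above; both leave NaN on the value rows.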