Data
client total avail used date charles 5000 5000 0 2/1/2022 charles 8000 5000 3000 2/1/2022 charles 500 500 0 3/1/2022 cara 1000 100 900 3/1/2022 cara 2000 50 1550 2/1/2022 cara 500 0 500 2/1/2022 cara 100 0 100 3/1/2022
Desired
client used date charles 0.23 2/1/2022 charles 0 3/1/2022 cara 0.84 2/1/2022 cara 1 3/1/2022
Doing
Groupby client and date; create calculated column ['used']/['total'] * 100
first create derived column
df["used"] = df["used)"]/['total'] * 100 df.groupby(['client','date']) .agg({'used':'max'}).reset_index() df["used"] = df["used)"]/['total'] * 100
Any suggestion is helpful
Advertisement
Answer
You aren’t using a correct aggregation function. You should be using sum
on both your “used” and “total” columns:
df = pd.DataFrame( { 'client': ['charles', 'charles', 'charles', 'cara', 'cara', 'cara', 'cara'], 'total': [5000, 8000, 500, 1000, 2000, 500, 100], 'used': [0, 3000, 0, 900, 1550, 500, 100], 'date': ['2/1/2022', '2/1/2022', '3/1/2022', '3/1/2022', '2/1/2022', '2/1/2022', '3/1/2022'] } ) tmp = df.groupby(['client', 'date']).agg({'used': 'sum', 'total': 'sum'}) tmp['used_frac'] = tmp['used'] / tmp['total'] tmp.reset_index(inplace=True) tmp # Out: client date used total used_frac 0 cara 2/1/2022 2050 2500 0.820000 1 cara 3/1/2022 1000 1100 0.909091 2 charles 2/1/2022 3000 13000 0.230769 3 charles 3/1/2022 0 500 0.000000