Data
client total avail used date
charles 5000 5000 0 2/1/2022
charles 8000 5000 3000 2/1/2022
charles 500 500 0 3/1/2022
cara 1000 100 900 3/1/2022
cara 2000 50 1550 2/1/2022
cara 500 0 500 2/1/2022
cara 100 0 100 3/1/2022
Desired
client used date
charles 0.23 2/1/2022
charles 0 3/1/2022
cara 0.84 2/1/2022
cara 1 3/1/2022
Doing
Groupby client and date; create calculated column ['used']/['total'] * 100
first create derived column
df["used"] = df["used)"]/['total'] * 100
df.groupby(['client','date'])
.agg({'used':'max'}).reset_index()
df["used"] = df["used)"]/['total'] * 100
Any suggestion is helpful
Advertisement
Answer
You aren’t using a correct aggregation function. You should be using sum on both your “used” and “total” columns:
df = pd.DataFrame(
{
'client': ['charles', 'charles', 'charles', 'cara', 'cara', 'cara', 'cara'],
'total': [5000, 8000, 500, 1000, 2000, 500, 100],
'used': [0, 3000, 0, 900, 1550, 500, 100],
'date': ['2/1/2022', '2/1/2022', '3/1/2022', '3/1/2022', '2/1/2022', '2/1/2022', '3/1/2022']
}
)
tmp = df.groupby(['client', 'date']).agg({'used': 'sum', 'total': 'sum'})
tmp['used_frac'] = tmp['used'] / tmp['total']
tmp.reset_index(inplace=True)
tmp
# Out:
client date used total used_frac
0 cara 2/1/2022 2050 2500 0.820000
1 cara 3/1/2022 1000 1100 0.909091
2 charles 2/1/2022 3000 13000 0.230769
3 charles 3/1/2022 0 500 0.000000