Skip to content
Advertisement

Group by 2 columns with calculation of quantile of 3rd numerical column

I have a df below as:

 day     mealtype    oz
Monday   Snack      .34
Monday   Snack      .43
Monday   Dinner     .31
Tuesday  Breakfast  .10
Monday   Dinner     .11
Tuesday  Dinner     .09
Sunday   Snack      .33
Sunday   Dinner     .01
Sunday   Lunch      .03

I have code below that calculates the % of each mealtype for each day

 df.groupby('day')['mealtype'].value_counts(normalize=True).to_frame('%').reset_index().round(1)

How can I tweak this code to give me the quantiles – p50 and p90 of the oz column, but grouped by mealtype day and mealtype as well?

Thanks!

Advertisement

Answer

You can try this

p50 = df.groupby(['day','mealtype']).agg(p50 = ('oz',lambda x: x.quantile(0.5))).reset_index()
p90 = df.groupby(['day','mealtype']).agg(p90 = ('oz',lambda x: x.quantile(0.9))).reset_index()
new_df=p50.merge(p90, on=['day','mealtype'])
new_df

or

new_df = df.groupby(['day','mealtype'])['oz'].agg(p50 = (lambda x: x.quantile(0.5)),
                                         p90 = (lambda x: x.quantile(0.9))).reset_index()

output

     day    mealtype    p50     p90
0   Monday  Dinner      0.210   0.290
1   Monday  Snack       0.385   0.421
2   Sunday  Dinner      0.010   0.010
3   Sunday  Lunch       0.030   0.030
4   Sunday  Snack       0.330   0.330
5   Tuesday Breakfast   0.100   0.100
6   Tuesday Dinner      0.090   0.090
User contributions licensed under: CC BY-SA
2 People found this is helpful
Advertisement