Skip to content
Advertisement

How do I do a groupby in python to split between orders?

I have a dataframe that shows order listing. How do I use it to find the number of orders that contain spicy food?

   Order       Item Name Spicy
0   1000  Calamari Rings    No
1   1001      Cheesecake    No
2   1001   Spicy Chicken    Yes
3   1002      Spicy Lamb    Yes

Using this code gives me 2 Yes and 2 No, but it should actually be 2 Yes and 1 No as order 1001 is duplicated. Thank you.

dd = df_orders.groupby("Item Name")["Order"].count()

I would like to get an output that shows the number of orders that contains spicy food.

   Spicy        
0    Yes   2
1     No   1

Advertisement

Answer

There’s perhaps a simpler way, but this works.

First group by order and spicy counts, to get the count of spicy for each order. Then sort by spicy and drop duplicates by order number (removes the ‘No’ in spicy column if a yes exists for that order). Then group by Spicy again and count to get the counts.

df_orders = pd.DataFrame({'Order' : [1000, 1001, 1001, 1002],
                          'Item Name' : ['Calamari Rings', 'Cheesecake', 'Spicy Chicken', 'Spicy Lamb'],
                          'Spicy' : ['No', 'No', 'Yes', 'Yes']})

df_grouped = df_orders.groupby(['Order', 'Spicy']).count().reset_index()
df_grouped = df_grouped.sort_values(by='Spicy').drop_duplicates(subset='Order', keep='last')
df_grouped = df_grouped.groupby('Spicy').count()['Order'].reset_index()

Output:

  Spicy  Order
0    No      1
1   Yes      2
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement