How to transform a dataframe to have values of a column as new columns, grouped by other columns?

I have an Excel file as below

Excel Raw File

I want to reshape the output so that it is grouped by Name and Fruits down the left-hand side, with the dates across the top, as shown below.

Outcome Excel

I tried to create the output like this:

Output = dt.groupby(['Name','Fruits'])['QTY'].sum()

but I do not know how to split QTY by Date across the top.

Answer

UPDATE

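If your index has duplicate entries (the same Name, Fruits and Date combination appears more than once), pivot() raises a ValueError. Use pivot_table() with an aggregation function instead; the fill_value=0 argument plays the same role as the fillna(0) call described at the end of this answer: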

df.pivot_table(index=['Name', 'Fruits'], columns=['Date'], values='QTY', aggfunc='sum', fill_value=0)

That gives me the following results (the screenshot is not reproduced here).
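As a small runnable sketch of the pivot_table() call above: the data below is made up for illustration (the names, dates and quantities are hypothetical, since the original spreadsheet is not reproduced here), and it deliberately repeats some Name/Fruits/Date combinations.

```python
import pandas as pd

# Hypothetical sample data; the original Excel file is not available.
df = pd.DataFrame({
    "Name":   ["Ann", "Ann", "Ann", "Bob", "Bob"],
    "Fruits": ["Apple", "Apple", "Apple", "Pear", "Pear"],
    "Date":   ["2021-01-01", "2021-01-01", "2021-01-02",
               "2021-01-01", "2021-01-01"],
    "QTY":    [1, 2, 5, 3, 4],
})

# (Ann, Apple, 2021-01-01) and (Bob, Pear, 2021-01-01) each appear twice,
# so aggfunc="sum" adds them up; combinations with no rows are filled with 0.
wide = df.pivot_table(index=["Name", "Fruits"], columns="Date",
                      values="QTY", aggfunc="sum", fill_value=0)
print(wide)
```

Here Bob has no Pear rows on 2021-01-02, so that cell comes out as 0 rather than NaN.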

ORIGINAL

If you are using pandas, you can pivot the DataFrame like this:

df.pivot(index=['Name', 'Fruits'], columns=['Date'], values='QTY').fillna(0)

The fillna(0) call replaces the NaNs in the table with zeroes. If you don't mind the NaNs, you can simply drop that call.
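To make the effect of fillna(0) concrete, here is a minimal sketch on made-up data (all values are hypothetical). It also shows that pivot() fails once a Name/Fruits/Date combination is repeated. Passing a list as index to pivot() is assumed to require a reasonably recent pandas (1.1 or later).

```python
import pandas as pd

# Hypothetical sample data with unique (Name, Fruits, Date) combinations.
df = pd.DataFrame({
    "Name":   ["Ann", "Ann", "Bob"],
    "Fruits": ["Apple", "Apple", "Pear"],
    "Date":   ["2021-01-01", "2021-01-02", "2021-01-01"],
    "QTY":    [3, 5, 7],
})

# Without fillna, the missing (Bob, Pear, 2021-01-02) cell is NaN.
raw = df.pivot(index=["Name", "Fruits"], columns="Date", values="QTY")

# With fillna(0), that cell becomes 0 instead.
filled = raw.fillna(0)
print(filled)

# Repeating the first row creates a duplicate combination,
# which pivot() cannot reshape and reports as a ValueError.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)
try:
    dup.pivot(index=["Name", "Fruits"], columns="Date", values="QTY")
    pivot_failed = False
except ValueError:
    pivot_failed = True
print("pivot failed on duplicates:", pivot_failed)
```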

User contributions licensed under: CC BY-SA