
pandas pivot dataframe to 3d data

There seem to be many ways to pivot flat table data into a 3D array, but I can't find one that works. Suppose I have some data with columns=['name', 'type', 'date', 'value']. When I try to pivot via

pivot(index='name', columns=['type', 'date'], values='value')

I get

ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

Am I maybe reading the docs for a development version of pandas? This seems to be the usage described there. I am running pandas 0.8.

More generally: if I have a Series with a MultiIndex ['x', 'y', 'z'], is there a pandas way to put it into a Panel? I can use groupby and get the job done, but that is almost what I would do in numpy to assemble an n-d array. It seems like a fairly generic operation, so I would imagine it is already implemented.


Answer

In this pandas version, pivot only supports a single column for generating the output columns. You probably want pivot_table, which can build a pivot table from multiple columns, e.g.:

pandas.pivot_table(your_dataframe, values='value', index='name', columns=['type', 'date'], aggfunc='sum')

The hierarchical columns mentioned in the API reference and documentation for pivot relate to cases where you have multiple value fields rather than multiple categories.

Assuming 'type' and 'date' are categories whose values should be used as the column names, you should use pivot_table.

However, if you want separate columns for each value field within the same category (e.g. 'type'), use pivot without specifying the values column, passing your category as the columns parameter.

For example, suppose you have this DataFrame:

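from pandas import DataFrame

df = DataFrame({'name': ['A', 'B', 'A', 'B'], 'type': [1, 1, 2, 2], 'date': ['2012-01-01', '2012-01-01', '2012-02-01', '2012-02-01'],  'value': [1, 2, 3, 4]})

# aggfunc='sum' keeps the integer values; the default aggregation is the mean
pt = df.pivot_table(values='value', index='name', columns=['type', 'date'], aggfunc='sum')
# no values argument, so every remaining column ('date', 'value') is spread across 'type'
p = df.pivot(index='name', columns='type')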

pt will be:

type           1           2
date  2012-01-01  2012-02-01
name                        
A              1           3
B              2           4

and p will be:

          date              value   
type           1           2      1  2
name                                  
A     2012-01-01  2012-02-01      1  3
B     2012-01-01  2012-02-01      2  4
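
One more practical difference between the two calls is how they treat repeated index/column combinations. The sketch below uses a hypothetical frame named dup (not part of the original question) in which one combination appears twice. It assumes a pandas version that accepts the index and columns keywords:

```python
import pandas as pd

# Hypothetical data: the ('A', 1, '2012-01-01') combination appears twice.
dup = pd.DataFrame({
    'name': ['A', 'A', 'B'],
    'type': [1, 1, 1],
    'date': ['2012-01-01', '2012-01-01', '2012-01-01'],
    'value': [1, 5, 2],
})

# pivot_table collapses the duplicate rows with the chosen aggregation.
summed = dup.pivot_table(values='value', index='name',
                         columns=['type', 'date'], aggfunc='sum')

# pivot has no aggregation step, so the same data cannot be reshaped.
try:
    dup.pivot(index='name', columns='type', values='value')
    pivot_failed = False
except ValueError:
    pivot_failed = True
```

Here summed holds 6 for name A (1 + 5), while the pivot call raises a ValueError because it cannot decide which of the two values to keep.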

NOTE: For pandas version < 0.14.0, the index and columns keyword arguments should be replaced with rows and cols respectively.
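
As for the second part of the question (a MultiIndex Series into something three-dimensional): Panel has since been removed from pandas, so one possible approach is to build a plain numpy array instead. This is only a sketch under some assumptions: a recent pandas with to_numpy, unique (name, type, date) combinations, and the variable names s, full and cube, which are made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['A', 'B', 'A', 'B'],
    'type': [1, 1, 2, 2],
    'date': ['2012-01-01', '2012-01-01', '2012-02-01', '2012-02-01'],
    'value': [1, 2, 3, 4],
})

# Series with a three-level MultiIndex (name, type, date).
s = df.set_index(['name', 'type', 'date'])['value']

# Sorted labels along each of the three axes.
names = sorted(df['name'].unique())
types = sorted(df['type'].unique())
dates = sorted(df['date'].unique())

# Reindex onto the full Cartesian product so every cell exists;
# combinations missing from the data become NaN.
full = pd.MultiIndex.from_product([names, types, dates],
                                  names=['name', 'type', 'date'])
cube = s.reindex(full).to_numpy().reshape(len(names), len(types), len(dates))
```

cube[i, j, k] then holds the value for names[i], types[j] and dates[k]. The reshape is only valid because from_product enumerates the combinations in row-major order, with the last level varying fastest.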

User contributions licensed under: CC BY-SA