There seem to be many ways to pivot flat table data into a 3-D array, but I can't find one that works. Suppose I have some data with columns=['name', 'type', 'date', 'value']. When I try to pivot via
pivot(index='name', columns=['type', 'date'], values='value')
I get
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
Am I reading docs from dev pandas maybe? It seems like this is the usage described there. I am running 0.8 pandas.
I guess, I’m wondering if I have a MultiIndex [‘x’, ‘y’, ‘z’] Series, is there a pandas way to put that in a panel? I can use groupby and get the job done, but then this is almost like what I would do in numpy to assemble an n-d array. Seems like a fairly generic operation so I would imagine it might be implemented already.
Answer
pivot only supports using a single column to generate your columns. You probably want to use pivot_table to generate a pivot table using multiple columns, e.g.
pandas.pivot_table(your_dataframe, values='value', index='name', columns=['type', 'date'], aggfunc='sum')
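When every (name, type, date) combination occurs only once, no aggregation is needed, and the same wide layout can probably be reached by building a MultiIndex and unstacking two of its levels. The following is only a sketch of that alternative, not something the answer itself proposes; the sample data and the variable name wide are assumptions made for illustration.

```python
import pandas as pd

# Illustrative data with the four columns named in the question.
df = pd.DataFrame({
    'name': ['A', 'B', 'A', 'B'],
    'type': [1, 1, 2, 2],
    'date': ['2012-01-01', '2012-01-01', '2012-02-01', '2012-02-01'],
    'value': [1, 2, 3, 4],
})

# Move the three key columns into a MultiIndex, then rotate the 'type' and
# 'date' levels into the column axis. Nothing is aggregated here, so this
# raises an error if any (name, type, date) combination is duplicated.
wide = df.set_index(['name', 'type', 'date'])['value'].unstack(['type', 'date'])

print(wide)
```

Unlike pivot_table, this route fails loudly on duplicate keys instead of silently aggregating them, which may or may not be what you want.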
The hierarchical columns that are mentioned in the API reference and documentation for pivot relate to cases where you have multiple value fields rather than multiple categories.
Assuming ‘type’ and ‘date’ are categories whose values should be used as the column names, you should use pivot_table.
However, if you want separate columns for different value fields for the same category (e.g. ‘type’), then you should use pivot without specifying the value column, passing your category as the columns parameter.
For example, suppose you have this DataFrame:
from pandas import DataFrame

df = DataFrame({'name': ['A', 'B', 'A', 'B'],
                'type': [1, 1, 2, 2],
                'date': ['2012-01-01', '2012-01-01', '2012-02-01', '2012-02-01'],
                'value': [1, 2, 3, 4]})
pt = df.pivot_table(values='value', index='name', columns=['type', 'date'])
p = df.pivot(index='name', columns='type')
pt will be:
type           1           2
date  2012-01-01  2012-02-01
name
A              1           3
B              2           4
and p will be:
            date              value
type           1           2      1  2
name
A     2012-01-01  2012-02-01      1  3
B     2012-01-01  2012-02-01      2  4
NOTE: For pandas version < 0.14.0, the index and columns keyword arguments should be replaced with rows and cols respectively.
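On the second part of the question (turning a three-level MultiIndex Series into something 3-D): Panel has been removed from later pandas releases, so one possible route is to go to a plain NumPy array. The sketch below assumes each (name, type, date) key is unique and that the index levels contain no unused values; the names s, full and cube are illustrative, not part of any pandas API.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['A', 'B', 'A', 'B'],
    'type': [1, 1, 2, 2],
    'date': ['2012-01-01', '2012-01-01', '2012-02-01', '2012-02-01'],
    'value': [1, 2, 3, 4],
})

# A Series indexed by (name, type, date), as described in the question.
s = df.set_index(['name', 'type', 'date'])['value']

# Build the full cartesian product of the level values so that every cell
# of the cube exists; combinations missing from the data become NaN.
full = pd.MultiIndex.from_product(s.index.levels, names=s.index.names)

# from_product varies the last level fastest, which matches NumPy's default
# row-major reshape: axis 0 is name, axis 1 is type, axis 2 is date.
shape = tuple(len(level) for level in s.index.levels)
cube = s.reindex(full).to_numpy().reshape(shape)

print(cube.shape)
```

Because missing combinations are filled with NaN, the resulting array is floating point even when the original values are integers.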