I have the following dataframe: I want to create pivot table to get the following table: I have tried to do this using pivot_table (pandas): but I get this error: DataError: No numeric types to aggregate I have read this post but it seems to be a bit different as I don't want to change the columns and also I

pandas pivot_tables doesn’t work with date data (No numeric types to aggregate)

I have the following dataframe:

index   id    code    data    date
0      AZ234  B213    apple   2020-09-01 <- duplicate id, code, data
1      AZ234  B213    apple   2022-02-02 <- duplicate id, code, data
2      AZ234  B213    banana  2020-07-01
3      AZ234  B213    orange  2020-05-11
4      AL612  B309    apple   2020-12-05
5      AL612  B309    banana  2020-07-21
6      AL612  B309    orange  2020-09-21

...

JavaScript
​x
 
index   id    code    data    date
0      AZ234  B213    apple   2020-09-01 <- duplicate id, code, data
1      AZ234  B213    apple   2022-02-02 <- duplicate id, code, data
2      AZ234  B213    banana  2020-07-01
3      AZ234  B213    orange  2020-05-11
4      AL612  B309    apple   2020-12-05
5      AL612  B309    banana  2020-07-21
6      AL612  B309    orange  2020-09-21
​
...
​

I want to create pivot table to get the following table:

id    code    apple         banana        orange
AZ234  B213   2020-09-01    2020-07-01     2020-05-11
AL612  B309   2020-12-05    2020-07-21     2020-09-21
...

JavaScript
 
id    code    apple         banana        orange
AZ234  B213   2020-09-01    2020-07-01     2020-05-11
AL612  B309   2020-12-05    2020-07-21     2020-09-21
...
​

I have tried to do this using pivot_table (pandas):

pd.pivot_table(df, values='date', index=['id','code'],
                       columns=['data'])

JavaScript
 
pd.pivot_table(df, values='date', index=['id','code'],
                       columns=['data'])
​

but I get this error:

DataError: No numeric types to aggregate

I have read this post but it seems to be a bit different as I don’t want to change the columns and also I got error when I tried to set_index with code and id ( ” ValueError: Index contains duplicate entries, cannot reshape”).

My goal is to create pivot table with dates as values of the table.

Answer

There are duplicates per id, date, data so is necessary add some aggregate function:

If there are datetimes:

df['date'] = pd.to_datetime(df['date'])

df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc='first')

df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc='max')

JavaScript
 
df['date'] = pd.to_datetime(df['date'])
​
df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc='first')
​
df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc='max')
​

If there are strings:

print (df['date'].dtype)

df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc=','.join)

JavaScript
 
print (df['date'].dtype)
​
df.pivot_table(values='date', index=['id','code'], columns=['data'], aggfunc=','.join)
​

Advertisement

Answer