I have a nested hearders in a dataframe after pivoting it and I'd like to make it flat, with repeated item on each rows (like in a excel) May you help me ? where I'd like to get a flat header and reapeated rows : many thanks Nico Answer You need to use droplevel(0) which will remove the top level

change dataframe pivot_table headers to save it in an excel file

I have a nested hearders in a dataframe after pivoting it and I’d like to make it flat, with repeated item on each rows (like in a excel) May you help me ?

import pandas as pd

data = {
  "year": [2022, 2022 , 2021 , 2021 , 2020 ],
  "client": ["A", "B", "B", "C", "C"],
  "product" : [ "P1", "P2" , "P1", "P2", "P1"],
  "sales" : [ 10,20, 20, 22, 25]

}

df = pd.DataFrame(data)

df2 = df.pivot_table ( index = ["year","client"] , columns = ["product"] , values = ["sales"] , aggfunc ="sum")
df2

              sales
product      P1     P2
year    client      
2020    C   25.0    NaN
2021    B   20.0    NaN
        C   NaN     22.0
2022    A   10.0    NaN
        B   NaN     20.0

JavaScript
​x
 
import pandas as pd
​
data = {
  "year": [2022, 2022 , 2021 , 2021 , 2020 ],
  "client": ["A", "B", "B", "C", "C"],
  "product" : [ "P1", "P2" , "P1", "P2", "P1"],
  "sales" : [ 10,20, 20, 22, 25]
​
}
​
df = pd.DataFrame(data)
​
df2 = df.pivot_table ( index = ["year","client"] , columns = ["product"] , values = ["sales"] , aggfunc ="sum")
df2
​
              sales
product      P1     P2
year    client      
2020    C   25.0    NaN
2021    B   20.0    NaN
        C   NaN     22.0
2022    A   10.0    NaN
        B   NaN     20.0
​

where I’d like to get a flat header and reapeated rows :

year    client P1   P2  
2020    C   25.0    NaN
2021    B   20.0    NaN
2021    C   NaN     22.0
2022    A   10.0    NaN
2022    B   NaN     20.0

JavaScript
 
year    client P1   P2  
2020    C   25.0    NaN
2021    B   20.0    NaN
2021    C   NaN     22.0
2022    A   10.0    NaN
2022    B   NaN     20.0
​

many thanks

Nico

Answer

You need to use droplevel(0) which will remove the top level (Sales). Setting columns.name to None will remove the Products. So, after df2 is created, add these lines…

>> df2.columns = df2.columns.droplevel(0)
>> df2.columns.name = None
>> df2 =df2.reset_index()
>> df2

    year    client  P1  P2
0   2020    C   25.0    NaN
1   2021    B   20.0    NaN
2   2021    C   NaN 22.0
3   2022    A   10.0    NaN
4   2022    B   NaN 20.0

JavaScript
 
>> df2.columns = df2.columns.droplevel(0)
>> df2.columns.name = None
>> df2 =df2.reset_index()
>> df2
​
    year    client  P1  P2
0   2020    C   25.0    NaN
1   2021    B   20.0    NaN
2   2021    C   NaN 22.0
3   2022    A   10.0    NaN
4   2022    B   NaN 20.0
​

Advertisement

Answer