Pandas: Cannot address column from previously merged multi level data frame

After aggregating a data frame with group by, I’m trying to “flatten” the headers into one level to properly export the data as CSV:

df.columns = [' '.join(col).strip() for col in df.columns.values]
df.columns

The output looks like this:

Index(['count', 'average', 'mean',
       'sum'],
      dtype='object')

If I display the data frame directly, I get different information:

df

Output:

                 count average mean sum
col1 col2 col3 
...

It seems like pandas merged the column names, but I still have two levels of column description. If I try to address 2nd level columns, it raises an error:

df.drop('col1', axis = 'columns', level = 0)

Output:

AssertionError: axis must be a MultiIndex

Or

df.drop('col1', axis = 'columns')

Output:

KeyError: "['col1'] not found in axis"

So it seems like I’m stuck with something in between. If I export the data frame to CSV and import it again, everything is fine:

df.to_csv('data.csv')

And

df = pd.read_csv('data.csv')
df.drop('col1', axis = 'columns')

So, what am I misunderstanding and doing wrong here?

Answer

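You probably want to call df.reset_index() after the df.groupby aggregation to “flatten” the headers as requested. By default, groupby stores the group keys (col1, col2, col3) as index levels rather than as columns, which is why drop cannot find col1 among the columns. See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.reset_index.html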
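
A minimal sketch of this suggestion, assuming made-up sample data: the names col1, col2 and col3 come from the question, while the "value" column, the row contents and the choice of aggregations are hypothetical and only serve to reproduce the situation.

```python
import pandas as pd

# Hypothetical sample data; "value" is an assumed name for the aggregated column.
raw = pd.DataFrame({
    "col1": ["a", "a", "b"],
    "col2": ["x", "x", "y"],
    "col3": [1, 1, 2],
    "value": [10.0, 20.0, 30.0],
})

# Aggregating with a list of functions yields two-level column headers,
# and the group keys become index levels rather than columns.
df = raw.groupby(["col1", "col2", "col3"]).agg({"value": ["count", "mean", "sum"]})

# Flatten the two header levels into single strings, as in the question.
df.columns = [" ".join(col).strip() for col in df.columns.values]

# col1 is still an index level here, so it is not among the columns.
print(df.index.names)
print("col1" in df.columns)

# reset_index() turns the index levels into ordinary columns,
# after which col1 can be dropped like any other column.
flat = df.reset_index()
flat = flat.drop("col1", axis="columns")
print(flat.columns.tolist())
```

This would also explain why the CSV round trip appears to fix things: to_csv writes the index levels out as regular columns, and read_csv (without index_col) reads them back as columns, so col1 becomes addressable afterwards.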

User contributions licensed under: CC BY-SA