pandas reshape multiple columns fails with KeyError

For a pandas DataFrame defined by:

import pandas as pd
df = pd.DataFrame({'id':[1,2,3], 're_foo':[1,2,3], 're_bar':[4,5,6], 're_foo_baz':[0.4, 0.8, .9], 're_bar_baz':[.4,.5,.6], 'iteration':[1,2,3]})
display(df)

I want to reshape to the following format:

id, metric, value, iteration
1, foo    , 1    , 1
1, bar    , 4    , 1
1, foo_baz, 0.4  , 1
1, bar_baz, 0.4  , 1
...

My attempt:

pd.wide_to_long(r, stubnames='re', i=['re_foo', 're_bar', 're_foo_baz', 're_bar_baz'], j='metric')

only results in a KeyError. How can I fix the reshape so that it works?

Answer

Here’s a way using stack:

# fix column names: strip the leading re_ prefix (anchored so only a prefix is removed)
df.columns = df.columns.str.replace(r'^re_', '', regex=True)

# reshape dataframe into required format
df = (
    df.set_index(['id', 'iteration'])
      .stack()
      .reset_index()
      .rename(columns={'level_2': 'metric', 0: 'value'})
)

    id  iteration   metric  value
0    1          1      foo    1.0
1    1          1      bar    4.0
2    1          1  foo_baz    0.4
3    1          1  bar_baz    0.4
4    2          2      foo    2.0
5    2          2      bar    5.0
6    2          2  foo_baz    0.8
7    2          2  bar_baz    0.5
8    3          3      foo    3.0
9    3          3      bar    6.0
10   3          3  foo_baz    0.9
11   3          3  bar_baz    0.6
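
If you would rather keep `pd.wide_to_long`, the following is a minimal sketch of how the call from the question could be adjusted. It assumes that `id` and `iteration` together identify each row, so they go into `i` instead of the `re_*` columns, and that the `re_*` columns are the ones to unpivot. The `sep='_'` and `suffix=r'\w+'` arguments are needed because the default suffix only matches digits. The names `long_df` and `melted` are made up for illustration, and the `melt` variant is included only as an alternative route to the same shape.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    're_foo': [1, 2, 3],
    're_bar': [4, 5, 6],
    're_foo_baz': [0.4, 0.8, 0.9],
    're_bar_baz': [0.4, 0.5, 0.6],
    'iteration': [1, 2, 3],
})

# wide_to_long: `i` holds the identifier columns (assumed here to be
# id + iteration), `stubnames` is the shared prefix, and `suffix` must
# be widened from the default r'\d+' to match names like "foo_baz".
long_df = (
    pd.wide_to_long(
        df,
        stubnames='re',
        i=['id', 'iteration'],
        j='metric',
        sep='_',
        suffix=r'\w+',
    )
    .reset_index()
    .rename(columns={'re': 'value'})
)

# Alternative with melt: unpivot everything except the identifiers,
# then strip the re_ prefix from the metric names.
melted = df.melt(id_vars=['id', 'iteration'], var_name='metric', value_name='value')
melted['metric'] = melted['metric'].str.replace(r'^re_', '', regex=True)

print(long_df.sort_values(['id', 'metric']).to_string(index=False))
```

Row order differs between the two variants (and from the `stack` result), so sort by `id` and `metric` if you need to compare them.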
User contributions licensed under: CC BY-SA