Skip to content
Advertisement

Transform dataframe format

How do I convert the following dataframe from df1 to df2?

df1 = pd.DataFrame([['a b c', '1 2 3', 2011], ['d e f', '4 5 6', 2012]])
#df1
           a       b     c
    0  a b c   1 2 3  2011
    1  d e f   4 5 6  2012


df2 = pd.DataFrame([['a', 1, 2011], ['b', 2, 2011], ['c', 3, 2011],
['d', 4, 2012], ['e', 5, 2012], ['f', 6, 2012]])

#df2
   a  b     c
0  a  1  2011
1  b  2  2011
2  c  3  2011
3  d  4  2012
4  e  5  2012
5  f  6  2012

Thanks in advance!

Advertisement

Answer

You could try as follows. Apply .str.split to columns a and b in a loop, each time exploding the result. This will get us 2 pd.Series and we use pd.concat to put them together. After this, we just need to assign col c, and reset the index. So:

import pandas as pd

# adding columns=['a','b','c'] to match data in your post:
df1 = pd.DataFrame([['a b c', '1 2 3', 2011], ['d e f', '4 5 6', 2012]], columns=['a','b','c'])

res = pd.concat([df1[col].str.split(' ').explode() for col in ['a','b']], axis=1)

# change dtype for second col
res['b'] = res['b'].astype(int)

# assign column `c` and finally, reset index
res['c'] = df1['c']
res.reset_index(drop=True,inplace=True)

print(res)

   a  b     c
0  a  1  2011
1  b  2  2011
2  c  3  2011
3  d  4  2012
4  e  5  2012
5  f  6  2012
User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement