How do I convert the following dataframe from df1 to df2?
df1 = pd.DataFrame([['a b c', '1 2 3', 2011], ['d e f', '4 5 6', 2012]]) #df1 a b c 0 a b c 1 2 3 2011 1 d e f 4 5 6 2012 df2 = pd.DataFrame([['a', 1, 2011], ['b', 2, 2011], ['c', 3, 2011], ['d', 4, 2012], ['e', 5, 2012], ['f', 6, 2012]]) #df2 a b c 0 a 1 2011 1 b 2 2011 2 c 3 2011 3 d 4 2012 4 e 5 2012 5 f 6 2012
Thanks in advance!
Advertisement
Answer
You could try as follows. Apply .str.split
to columns a
and b
in a loop, each time exploding
the result. This will get us 2 pd.Series
and we use pd.concat
to put them together. After this, we just need to assign col c
, and reset the index. So:
import pandas as pd # adding columns=['a','b','c'] to match data in your post: df1 = pd.DataFrame([['a b c', '1 2 3', 2011], ['d e f', '4 5 6', 2012]], columns=['a','b','c']) res = pd.concat([df1[col].str.split(' ').explode() for col in ['a','b']], axis=1) # change dtype for second col res['b'] = res['b'].astype(int) # assign column `c` and finally, reset index res['c'] = df1['c'] res.reset_index(drop=True,inplace=True) print(res) a b c 0 a 1 2011 1 b 2 2011 2 c 3 2011 3 d 4 2012 4 e 5 2012 5 f 6 2012