Skip to content
Advertisement

Split column into multiple columns with unique values in pandas

I have the following dataframe:

   Col
0  A,B,C
1  B,A,D
2  C
3  A,D,E,F
4  B,C,F
df = pd.DataFrame({'Col': ['A,B,C', 'B,A,D', 'C', 'A,D,E,F', 'B,C,F']})

which needs to be turned into:

   A B C D E F
0  A B C
1  A B   D
2      C
3  A     D E F
4    B C     F

Advertisement

Answer

Using pandas.concat:

pd.concat([pd.Series((idx:=x.split(',')), index=idx)
           for x in df['Col']], axis=1).T

For python < 3.8:

pd.concat([pd.Series(val, index=val)
           for x in df['Col']
           for val in [x.split(',')]], axis=1).T

Output:

     A    B    C    D    E    F
0    A    B    C  NaN  NaN  NaN
1    A    B  NaN    D  NaN  NaN
2  NaN  NaN    C  NaN  NaN  NaN
3    A  NaN  NaN    D    E    F
4  NaN    B    C  NaN  NaN    F

NB. add fillna('') to have empty strings for missing values

   A  B  C  D  E  F
0  A  B  C         
1  A  B     D      
2        C         
3  A        D  E  F
4     B  C        F
User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement