My dataframe has four columns with colors. I want to combine them into one column called “Colors” and use commas to separate the values.
For example, I'm trying to produce a Colors column like this:
ID   Black  Red  Blue  Green  Colors
120  NaN    red  NaN   green  red, green
121  black  NaN  blue  NaN    black, blue
My code is:
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x), axis=1)
But the output for ID 120 is:
, red, , green
And the output for ID 121 is:
black, , blue,
FOUND MY PROBLEM!
Earlier in my code, I replaced "None" with " " instead of NaN. After making that change and, following the feedback, inserting [x.notnull()], it works!
df['Black'] = df['Black'].replace('None', np.nan)
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x[x.notnull()]), axis=1)
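To illustrate why the blank placeholder mattered, here is a minimal sketch. It assumes a small hypothetical frame `demo` in which missing colors were stored as " "; the names `demo`, `color_cols`, `blank_kept`, and `cleaned` are made up for this example. Because `notnull()` treats " " as a real value, the filter only helps once the placeholder has been converted to NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: missing colors stored as a blank string " ".
demo = pd.DataFrame({
    "ID": [120, 121],
    "Black": [" ", "black"],
    "Red": ["red", " "],
    "Blue": [" ", "blue"],
    "Green": ["green", " "],
})
color_cols = ["Black", "Red", "Blue", "Green"]

# notnull() considers " " a valid value, so nothing is filtered out
# and the blanks still end up between the commas.
blank_kept = demo[color_cols].apply(lambda x: ", ".join(x[x.notnull()]), axis=1)

# Convert the placeholder to NaN in all color columns, then filter and join.
cleaned = demo[color_cols].replace(" ", np.nan)
demo["Colors"] = cleaned.apply(lambda x: ", ".join(x[x.notnull()]), axis=1)
```

Replacing the placeholder across all four columns at once avoids having to repeat the `replace` call for each column.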
Answer
You just need to drop the NaNs from each row before joining:
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x[x.notnull()]), axis=1)

    ID  Black  Red  Blue  Green       Colors
0  120    NaN  red   NaN  green   red, green
1  121  black  NaN  blue    NaN  black, blue
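For anyone who wants to run this end to end, here is a self-contained sketch. It assumes the sample data from the question, rebuilt by hand with real NaN values, and uses `x.dropna()` as an equivalent shorthand for `x[x.notnull()]`.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (real NaN for missing colors).
df = pd.DataFrame({
    "ID": [120, 121],
    "Black": [np.nan, "black"],
    "Red": ["red", np.nan],
    "Blue": [np.nan, "blue"],
    "Green": ["green", np.nan],
})

# Drop the missing entries in each row, then join what is left.
df["Colors"] = df[["Black", "Red", "Blue", "Green"]].apply(
    lambda x: ", ".join(x.dropna()), axis=1
)

print(df["Colors"].tolist())  # ['red, green', 'black, blue']
```

A row in which every color is NaN yields an empty string, since joining an empty sequence returns "".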