My dataframe has four columns with colors. I want to combine them into one column called “Colors” and use commas to separate the values.
For example, I’m trying to combine into a Colors column like this:
ID   Black  Red  Blue  Green  Colors
120  NaN    red  NaN   green  red, green
121  black  NaN  blue  NaN    black, blue
My code is:
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x), axis=1)
But the output for ID 120 is:
, red, , green
And the output for ID 121 is:
black, , blue,
FOUND MY PROBLEM!
Earlier in my code, I replaced "None" with " " instead of NaN. After making that change and incorporating the feedback to insert [x.notnull()], it works!
df['Black'] = df['Black'].replace('None', np.nan)
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x[x.notnull()]), axis=1)
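For anyone hitting the same thing: here is a minimal sketch of cleaning the placeholders in all four color columns at once, rather than only in Black. The sample data, the color_cols name, and the choice to treat whitespace-only strings as missing are my assumptions for illustration, not part of the original code.

```python
import pandas as pd

# Hypothetical sample data using the placeholders described above:
# the string "None" and a single space instead of real missing values.
df = pd.DataFrame({
    "ID": [120, 121],
    "Black": ["None", "black"],
    "Red": ["red", " "],
    "Blue": [" ", "blue"],
    "Green": ["green", "None"],
})
color_cols = ["Black", "Red", "Blue", "Green"]

# Strip surrounding whitespace so " " becomes the empty string.
cleaned = df[color_cols].apply(lambda s: s.str.strip())

# mask() replaces every cell where the condition is True with a missing value.
df[color_cols] = cleaned.mask(cleaned.isin(["", "None"]))

print(df[color_cols].isna())
```

Once the placeholders are real missing values, filtering with x[x.notnull()] before the join leaves no stray commas.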
Answer
You just need to handle the NaNs by filtering them out of each row before joining:
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x[x.notnull()]), axis=1)

    ID  Black  Red  Blue  Green  Colors
0  120  NaN    red  NaN   green  red, green
1  121  black  NaN  blue  NaN    black, blue
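If you want to try this end to end, here is a self-contained sketch that rebuilds the two example rows and applies the same filter. The DataFrame construction and the helper name join_present are assumptions added for illustration; the filtering step itself is the x[x.notnull()] approach from the answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example rows from the question, with real NaN values.
df = pd.DataFrame({
    "ID": [120, 121],
    "Black": [np.nan, "black"],
    "Red": ["red", np.nan],
    "Blue": [np.nan, "blue"],
    "Green": ["green", np.nan],
})
color_cols = ["Black", "Red", "Blue", "Green"]

def join_present(row):
    # Boolean indexing keeps only the non-missing entries of the row,
    # so the join never sees a NaN.
    return ", ".join(row[row.notnull()])

# axis=1 passes each row (as a Series) to the helper.
df["Colors"] = df[color_cols].apply(join_present, axis=1)
print(df)
```

Note that this only works when the missing entries really are NaN. Placeholder strings such as " " or "None" count as non-null and will still end up in the joined result.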