
How to combine DataFrame columns of strings into a single column?

I have a DF with about 50 columns. 5 of them contain strings that I want to combine into a single column, separating the strings with commas but also keeping the spaces within each of the strings. Moreover, some values are missing (NaN). The last requirement would be to remove duplicates if they exist.

So I have something like this in my DF:

symptom_1 symptom_2 symptom_3 symptom_4 symptom 5
muscle pain super headache diarrhea Sore throat Fatigue
super rash ulcera super headache
diarrhea super diarrhea
something awful something awful

And I need something like this:

symptom_1 symptom_2 symptom_3 symptom_4 symptom 5 all_symptoms
muscle pain super headache diarrhea Sore throat Fatigue muscle pain, super headache, diarrhea, Sore throat, Fatigue
super rash ulcera super headache super rash, ulcera, super headache
diarrhea super diarrhea diarrhea, super diarrhea
something awful something awful something awful

I wrote the following function. It merges all the columns, but it does not respect the spaces within the original strings, which is a must.

def merge_columns_into_one(DataFrame, columns_to_combine, new_col_name, drop_originals = False):
    DataFrame[new_col_name] = DataFrame[columns_to_combine].apply(lambda x: ','.join(x.dropna().astype(str)),axis=1)
    return DataFrame

Thanks in advance for your help!

Edit: while I'm writing this question, the second markdown table appears just fine in the preview, but as soon as I post it the table loses its format. I hope you get the idea of what I'm trying to do. Otherwise, I'd appreciate your feedback on how to fix the MD table.

Answer

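Use apply() row by row with dropna() and unique(), then join the remaining values with ', '. Here df1 is assumed to be the subset of df that holds only the five symptom columns. Dropping the NaN values, rather than filling them with '' and stripping trailing commas afterwards, also avoids stray commas when the missing value is in the first or a middle column:

df['all_symptoms'] = df1.apply(lambda row: ', '.join(row.dropna().unique()), axis=1)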

Now if you print df you will get your desired output:

symptom_1 symptom_2 symptom_3 symptom_4 symptom 5 all_symptoms
muscle pain super headache diarrhea Sore throat Fatigue muscle pain, super headache, diarrhea, Sore throat, Fatigue
super rash ulcera super headache super rash, ulcera, super headache
diarrhea super diarrhea diarrhea, super diarrhea
something awful something awful something awful
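
For anyone who wants to try this end to end, here is a minimal, self-contained sketch that keeps the function signature from the question. Several things in it are assumptions rather than facts from the post: the positions of the NaN cells in the sample data (the original table lost its layout), the column name symptom_5 written with an underscore, and the extra sep parameter. It also returns a copy instead of modifying the input frame, which is a design choice, not something the question asked for.

```python
import numpy as np
import pandas as pd


def merge_columns_into_one(frame, columns_to_combine, new_col_name,
                           sep=", ", drop_originals=False):
    """Join the non-missing, de-duplicated strings of each row into one column."""
    def join_row(row):
        # dropna() skips NaN cells; unique() keeps the first occurrence
        # of each value, in the original left-to-right order.
        return sep.join(row.dropna().astype(str).unique())

    result = frame.copy()  # leave the caller's DataFrame untouched
    result[new_col_name] = result[columns_to_combine].apply(join_row, axis=1)
    if drop_originals:
        result = result.drop(columns=columns_to_combine)
    return result


# Sample data: the NaN positions are assumed, not taken from the question.
symptom_cols = ["symptom_1", "symptom_2", "symptom_3", "symptom_4", "symptom_5"]
df = pd.DataFrame(
    [
        ["muscle pain", "super headache", "diarrhea", "Sore throat", "Fatigue"],
        [np.nan, "super rash", "ulcera", np.nan, "super headache"],
        ["diarrhea", np.nan, np.nan, "super diarrhea", np.nan],
        [np.nan, "something awful", np.nan, "something awful", np.nan],
    ],
    columns=symptom_cols,
)

merged = merge_columns_into_one(df, symptom_cols, "all_symptoms")
print(merged["all_symptoms"].tolist())
```

Spaces inside each value are preserved because the strings are only joined, never split. Passing drop_originals=True returns a frame without the source columns. A row whose symptom columns are all NaN ends up with an empty string.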
User contributions licensed under: CC BY-SA