Skip to content
Advertisement

Create Pandas DataFrame column which joins column names for any non na values

How do I create a new column which joins the column names for any non na values on a per row basis.

  • Please note the duplicate index.

Code

so_df = pd.DataFrame({"ma_1":[10,np.nan,13,15],
             "ma_2":[10,11,np.nan,15],
             "ma_3":[np.nan,11,np.nan,15]},index=[0,1,1,2])

Example DF

   ma_1     ma_2    ma_3
0   10.0    10.0    NaN
1   NaN     11.0    11.0
1   13.0    NaN     NaN
2   15.0    15.0    15.0

Desired output is a new column which joins the column names for non na values as per col_names example below.

so_df["col_names"] = ["ma_1, ma_2","ma_2, ma_3","ma_1","ma_1, ma_2, ma_3"]


    ma_1    ma_2    ma_3    col_names
0   10.0    10.0    NaN     ma_1, ma_2
1   NaN     11.0    11.0    ma_2, ma_3
1   13.0    NaN     NaN     ma_1
2   15.0    15.0    15.0    ma_1, ma_2, ma_3

Advertisement

Answer

Try with dot

df['new'] = df.notna().dot(df.columns+',').str[:-1]
df
Out[77]: 
   ma_1  ma_2  ma_3             new
0  10.0  10.0   NaN       ma_1,ma_2
1   NaN  11.0  11.0       ma_2,ma_3
1  13.0   NaN   NaN            ma_1
2  15.0  15.0  15.0  ma_1,ma_2,ma_3
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement