How do I create a new column which joins the column names for any non na values on a per row basis.
- Please note the duplicate index.
Code
JavaScript
x
4
1
so_df = pd.DataFrame({"ma_1":[10,np.nan,13,15],
2
"ma_2":[10,11,np.nan,15],
3
"ma_3":[np.nan,11,np.nan,15]},index=[0,1,1,2])
4
Example DF
JavaScript
1
6
1
ma_1 ma_2 ma_3
2
0 10.0 10.0 NaN
3
1 NaN 11.0 11.0
4
1 13.0 NaN NaN
5
2 15.0 15.0 15.0
6
Desired output is a new column which joins the column names for non na values as per col_names
example below.
JavaScript
1
9
1
so_df["col_names"] = ["ma_1, ma_2","ma_2, ma_3","ma_1","ma_1, ma_2, ma_3"]
2
3
4
ma_1 ma_2 ma_3 col_names
5
0 10.0 10.0 NaN ma_1, ma_2
6
1 NaN 11.0 11.0 ma_2, ma_3
7
1 13.0 NaN NaN ma_1
8
2 15.0 15.0 15.0 ma_1, ma_2, ma_3
9
Advertisement
Answer
Try with dot
JavaScript
1
9
1
df['new'] = df.notna().dot(df.columns+',').str[:-1]
2
df
3
Out[77]:
4
ma_1 ma_2 ma_3 new
5
0 10.0 10.0 NaN ma_1,ma_2
6
1 NaN 11.0 11.0 ma_2,ma_3
7
1 13.0 NaN NaN ma_1
8
2 15.0 15.0 15.0 ma_1,ma_2,ma_3
9