I have a dataframe as seen below:
col_a1, col_a2, col_b1, col_b2 abc lmn def ghi qrs zxv vbn pej iop qaz eki lod yhe wqe
I need two columns now, column A and Column B. Conditions summarized:
Column A = col_a2 if col_a2 is present else col_a1
Column B = col_b1 if col_b1 is present else col_b2
The required dataframe should be as follows:
Column A Column B
abc lmn
ghi qrs
zxv vbn
pej iop
lod yhe
Answer
Try:
df['A'] = df.apply(lambda x: x['col_a2'] if x['col_a2'] != '' else x['col_a1'], axis=1)
df['B'] = df.apply(lambda x: x['col_b1'] if x['col_b1'] != '' else x['col_b2'], axis=1)
print(df[['A', 'B']])
A B
0 abc lmn
1 ghi qrs
2 zxv vbn
3 pej iop
4 lod yhe
The != '' check only works if the missing cells hold empty strings (as opposed to NaN, None, etc.). If you have actual NaN values, use pd.notna instead:
df['A'] = df.apply(lambda x: x['col_a2'] if pd.notna(x['col_a2']) else x['col_a1'], axis=1)
df['B'] = df.apply(lambda x: x['col_b1'] if pd.notna(x['col_b1']) else x['col_b2'], axis=1)
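A vectorized alternative, in case the row-wise apply is slow on a large frame: the sketch below turns empty strings into NaN and then uses Series.fillna to fall back to the second column. The sample rows are made up for illustration (the original table could not be recovered exactly), and it assumes "present" means neither NaN nor an empty string.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the asker's actual table.
df = pd.DataFrame({
    "col_a1": ["abc", "x1", np.nan],
    "col_a2": ["", "ghi", "zxv"],
    "col_b1": ["lmn", np.nan, "vbn"],
    "col_b2": ["y1", "qrs", ""],
})

# Treat empty strings as missing so one rule covers both cases.
df = df.mask(df == "")

# Prefer col_a2; where it is missing, fall back to col_a1.
df["A"] = df["col_a2"].fillna(df["col_a1"])
# Prefer col_b1; where it is missing, fall back to col_b2.
df["B"] = df["col_b1"].fillna(df["col_b2"])

print(df[["A", "B"]])
```

Series.combine_first would do the same job as fillna here; either way the fallback is aligned on the index, so no lambda is needed.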