I have a dataframe as seen below:
col_a1, col_a2, col_b1, col_b2 abc lmn def ghi qrs zxv vbn pej iop qaz eki lod yhe wqe
I need two columns now, column A and Column B. Conditions summarized:
Column A = col_a2 if col_a2 is present else col_a1 Column B = col_a1 if col_a1 is present else col_b2
The required dataframe should be as follows:
Column A Column B abc lmn ghi qrs zxv vbn pej iop lod yhe
Advertisement
Answer
Try:
df['A'] = df.apply(lambda x: x['col_a2'] if x['col_a2'] != '' else x['col_a1'], axis=1) df['B'] = df.apply(lambda x: x['col_b1'] if x['col_b1'] != '' else x['col_b2'], axis=1) print(df[['A', 'B']]) A B 0 abc lmn 1 ghi qrs 2 zxv vbn 3 pej iop 4 lod yhe
The !=''
will work if you truly have nothing in the cell (as opposed to a NaN
etc.). If you have actual NaN values use:
df['A'] = df.apply(lambda x: x['col_a2'] if pd.notna(x['col_a2']) else x['col_a1'], axis=1) df['B'] = df.apply(lambda x: x['col_b1'] if pd.notna(x['col_b1']) else x['col_b2'], axis=1)