Skip to content
Advertisement

If there is a second column present then populate second column values, else populate first column values in Dataframe

I have a dataframe as seen below:

col_a1, col_a2, col_b1, col_b2
 abc                     lmn
 def     ghi             qrs
 zxv                     vbn
 pej             iop     qaz
 eki     lod     yhe     wqe

I need two columns now, column A and Column B. Conditions summarized:

Column A = col_a2 if col_a2 is present else col_a1  
Column B = col_a1 if col_a1 is present else col_b2

The required dataframe should be as follows:

 Column A Column B
    abc    lmn
    ghi    qrs
    zxv    vbn
    pej    iop
    lod    yhe

Advertisement

Answer

Try:

df['A'] = df.apply(lambda x: x['col_a2'] if x['col_a2'] != '' else x['col_a1'], axis=1)
df['B'] = df.apply(lambda x: x['col_b1'] if x['col_b1'] != '' else x['col_b2'], axis=1)
print(df[['A', 'B']])

    A       B
0   abc     lmn
1   ghi     qrs
2   zxv     vbn
3   pej     iop
4   lod     yhe

The !='' will work if you truly have nothing in the cell (as opposed to a NaN etc.). If you have actual NaN values use:

df['A'] = df.apply(lambda x: x['col_a2'] if pd.notna(x['col_a2']) else x['col_a1'], axis=1)
df['B'] = df.apply(lambda x: x['col_b1'] if pd.notna(x['col_b1']) else x['col_b2'], axis=1)
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement