Creating another column in pandas df based on partially empty columns

Question

I want to create a third column in my pandas DataFrame based on columns 1 and 2 (id1 and id2). When both are present they always match, but I want the third column to take whichever value is available. If I just go off of id1, it is sometimes blank, so the third column ends up blank as well.

Accepted Answer

Backfill values from id2 into id1, extract the digits, convert them to int (which drops the leading zero), then convert back to str.

Given:

        id1   id2
    0  ID01  ID01
    1   NaN  ID03
    2  ID07   NaN
    3  ID08  ID08

Doing:

    df['college_name'] = 'College' + (df.bfill(axis=1)['id1']
                                        .str.extract(r'(\d+)')
                                        .astype(int)
                                        .astype(str))

Output:

        id1   id2 college_name
    0  ID01  ID01     College1
    1   NaN  ID03     College3
    2  ID07   NaN     College7
    3  ID08  ID08     College8

To check for rows where the ids are different:

Given:

        id1   id2
    0  ID01  ID01
    1   NaN  ID03
    2  ID07   NaN
    3  ID08  ID98

Doing:

    print(df[df.id1.ne(df.id2) & df.id1.notna() & df.id2.notna()])

Output:

        id1   id2
    3  ID08  ID98
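For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It is a variation, not the exact code from the answer: it uses `fillna` on the two named columns instead of `bfill(axis=1)` on the whole frame, and passes `expand=False` so that `str.extract` returns a Series. The sample data and the variable names `combined`, `numbers`, and `mismatches` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative sample frame mirroring the example in the answer.
df = pd.DataFrame({
    "id1": ["ID01", np.nan, "ID07", "ID08"],
    "id2": ["ID01", "ID03", np.nan, "ID08"],
})

# Take id1 where it is present, otherwise fall back to id2.
combined = df["id1"].fillna(df["id2"])

# Pull out the digits; expand=False returns a Series rather than
# a one-column DataFrame. int() drops the leading zero ("01" -> 1).
numbers = combined.str.extract(r"(\d+)", expand=False).astype(int).astype(str)

df["college_name"] = "College" + numbers

# Rows where both ids are present but disagree.
mismatches = df[df["id1"].notna() & df["id2"].notna() & df["id1"].ne(df["id2"])]

print(df)
print(mismatches)
```

With this sample data `mismatches` is empty, because every row that has both ids has matching ones. Note that a row where both id1 and id2 are missing would make `astype(int)` fail, so this sketch assumes at least one of the two is always filled in.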

Answer