Creating new column from existing columns

I have a DataFrame:

ID   P_1   P_2
1    NaN   NaN
2    124   342
3    NaN   234
4    123   NaN
5    2345  500

I want to make a new column titled P_3 such that:

ID   P_1   P_2   P_3
1    NaN   NaN   NaN
2    124   342   342
3    NaN   234   234
4    123   NaN   123
5    2345  500   500

My conditions are:

if P_1 is NaN, then P_3 = P_2
if P_1 is not NaN and P_2 is not NaN, then P_3 = P_2
if P_2 is NaN, then P_3 = P_1

I tried the following code:

conditions = [
    (df['P_1'] == float('NaN')),
    (df['P_1'] != float('NaN')) & (df['P_2'] != float('NaN')),
    (df['P_1'] != float('NaN')) & (df['P_2'] == float('NaN'))
    ]

values = [df['P_2'], df['P_2'], df['P_1']]

df['P_3'] = np.select(conditions, values)

But it gives me the following error:

Length of values does not match length of index

Answer

Your three conditions reduce to a single rule:

P_3 = P_2 if P_2 is not NaN else P_1
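
As a minimal sketch, that rule can be written almost word for word with numpy.where. The DataFrame below is rebuilt from the table in the question, and using ID as the index is an assumption on my part:

```python
import numpy as np
import pandas as pd

# DataFrame reconstructed from the question's table (ID as index is an assumption).
df = pd.DataFrame(
    {
        "P_1": [np.nan, 124, np.nan, 123, 2345],
        "P_2": [np.nan, 342, 234, np.nan, 500],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="ID"),
)

# Take P_2 wherever it is present, otherwise fall back to P_1.
# notna() is used because NaN never compares equal to anything, itself included.
df["P_3"] = np.where(df["P_2"].notna(), df["P_2"], df["P_1"])

print(df)
```

When both columns are NaN the fallback is NaN as well, which matches row 1 of the desired output.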

Series.combine_first does exactly this: it updates null elements with the value in the same location in other (ref: pandas docs).

>>> df["P_2"].combine_first(df["P_1"])
ID
1      NaN
2    342.0
3    234.0
4    123.0
5    500.0
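
To store the result as the new column, assign it back to the DataFrame. The sketch below also shows why the comparisons in the question never match: NaN is not equal to any value, including itself, so `== float('NaN')` is False on every row. If you would rather keep the np.select layout, isna() and notna() masks should give the same result. The DataFrame is rebuilt from the question's table, with ID assumed to be the index:

```python
import numpy as np
import pandas as pd

# DataFrame reconstructed from the question's table (ID as index is an assumption).
df = pd.DataFrame(
    {
        "P_1": [np.nan, 124, np.nan, 123, 2345],
        "P_2": [np.nan, 342, 234, np.nan, 500],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="ID"),
)

# Assign the combined Series as the new column.
df["P_3"] = df["P_2"].combine_first(df["P_1"])

# NaN never compares equal to anything, so this mask is False on every row.
eq_mask = df["P_1"] == float("nan")

# The question's three conditions, expressed with isna()/notna() instead of ==/!=.
conditions = [
    df["P_1"].isna(),
    df["P_1"].notna() & df["P_2"].notna(),
    df["P_1"].notna() & df["P_2"].isna(),
]
values = [df["P_2"], df["P_2"], df["P_1"]]

# np.select returns a plain array; wrap it so it lines up with df's index.
via_select = pd.Series(
    np.select(conditions, values, default=np.nan), index=df.index
)

print(df)
print(via_select)
```
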
User contributions licensed under: CC BY-SA