I have a df which looks like this:
df = pd.DataFrame({'Date':['2019-09-23','2019-09-24','2019-09-25','2019-09-26','2019-09-27','2019-09-28','2019-09-29'], 'Sep':[1,10,5,'NaN','NaN','NaN','NaN'], 'Dec':[2,8,4,7,9,1,5]})
I’m trying to create a new column called ‘First_Contract’:
- ‘First_Contract’ needs to take the third-last value of the ‘Sep’ column, counting back from the point where ‘Sep’ reaches NaN.
- The subsequent values need to be filled with ‘Dec’ column values.
Desired output:
df2= pd.DataFrame({'Date':['2019-09-23','2019-09-24','2019-09-25','2019-09-26','2019-09-27','2019-09-28','2019-09-29'], 'Sep':[1,10,5,'NaN','NaN','NaN','NaN'], 'Dec':[2,8,4,7,9,1,5], 'First_Contract':[1,8,4,7,9,1,5]})
How can I achieve this?
Answer
Let us do it step by step:
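    # Turn the 'NaN' strings into real missing values
    df['Sep'] = df['Sep'].replace('NaN', np.nan)
    # Start from the 'Dec' values everywhere
    df['FC'] = df['Dec']
    # Row two positions before the last valid 'Sep' value (assumes the default 0..n-1 index)
    ids = df.Sep.last_valid_index() - 2
    df.loc[ids, 'FC'] = df.Sep[ids]
    df
    Out[126]:
             Date   Sep  Dec  First_Contract   FC
    0  2019-09-23   1.0    2               1  1.0
    1  2019-09-24  10.0    8               8  8.0
    2  2019-09-25   5.0    4               4  4.0
    3  2019-09-26   NaN    7               7  7.0
    4  2019-09-27   NaN    9               9  9.0
    5  2019-09-28   NaN    1               1  1.0
    6  2019-09-29   NaN    5               5  5.0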
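For reference, here is a self-contained sketch of the same idea that can be pasted and run as-is. It is only one possible way to write it: the helper name add_first_contract and its parameters (near, far, offset) are made up for illustration, and it swaps the replace call for pd.to_numeric(..., errors='coerce') to turn the 'NaN' strings into real missing values. It also picks the row by position among the valid 'Sep' values, so it does not rely on the default integer index. When the valid values are contiguous, as in the question, this selects the same row as last_valid_index() - 2.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': ['2019-09-23', '2019-09-24', '2019-09-25', '2019-09-26',
             '2019-09-27', '2019-09-28', '2019-09-29'],
    'Sep': [1, 10, 5, 'NaN', 'NaN', 'NaN', 'NaN'],
    'Dec': [2, 8, 4, 7, 9, 1, 5],
})


def add_first_contract(frame, near='Sep', far='Dec', offset=2):
    """Hypothetical helper: copy `far`, then overwrite one row with `near`.

    The overwritten row is the one holding the (offset + 1)-th last valid
    value of `near`; with offset=2 that is the third-last valid value.
    """
    # Coerce placeholder strings such as 'NaN' to real missing values.
    near_vals = pd.to_numeric(frame[near], errors='coerce')

    # Start from the far-contract values everywhere.
    out = frame[far].astype(float).copy()

    # Integer positions of the rows where `near` has a valid value.
    valid_pos = np.flatnonzero(near_vals.notna().to_numpy())

    # Only overwrite if there are enough valid values to count back from.
    if len(valid_pos) > offset:
        pos = valid_pos[-(offset + 1)]
        out.iloc[pos] = near_vals.iloc[pos]

    result = frame.copy()
    result['First_Contract'] = out
    return result


result = add_first_contract(df)
print(result)
```

With the sample data, the valid 'Sep' rows are positions 0, 1 and 2, so the third-last one is row 0 and First_Contract gets its 'Sep' value there, with 'Dec' everywhere else. If 'Sep' has fewer than three valid values, this sketch leaves the column equal to 'Dec'; whether that is the right fallback depends on your data.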