I have a DataFrame that looks like this:
import pandas as pd

df = pd.DataFrame({'Date': ['2019-09-23', '2019-09-24', '2019-09-25', '2019-09-26', '2019-09-27', '2019-09-28', '2019-09-29'],
                   'Sep': [1, 10, 5, 'NaN', 'NaN', 'NaN', 'NaN'],
                   'Dec': [2, 8, 4, 7, 9, 1, 5]})
I’m trying to create a new column called ‘First_Contract’:
- ‘First_Contract’ needs to take the third-last value of the ‘Sep’ column, counting back from the point where ‘Sep’ becomes NaN.
- All other rows need to be filled with the ‘Dec’ column values.
Desired output:
df2 = pd.DataFrame({'Date': ['2019-09-23', '2019-09-24', '2019-09-25', '2019-09-26', '2019-09-27', '2019-09-28', '2019-09-29'],
                    'Sep': [1, 10, 5, 'NaN', 'NaN', 'NaN', 'NaN'],
                    'Dec': [2, 8, 4, 7, 9, 1, 5],
                    'First_Contract': [1, 8, 4, 7, 9, 1, 5]})
How can I achieve this?
Answer
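Let's do it step by step: turn the 'NaN' strings into real missing values, copy ‘Dec’ into the new column, then overwrite the row two positions before the last valid ‘Sep’ value.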
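# Turn the 'NaN' strings into real missing values (Sep becomes numeric)
df['Sep'] = pd.to_numeric(df['Sep'], errors='coerce')
# Start from Dec, then overwrite one row with the third-last valid Sep value
df['FC'] = df['Dec']
ids = df['Sep'].last_valid_index() - 2  # assumes the default 0..n-1 integer index
df.loc[ids, 'FC'] = df.loc[ids, 'Sep']
df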
Out[126]:
         Date   Sep  Dec  First_Contract   FC
0  2019-09-23   1.0    2               1  1.0
1  2019-09-24  10.0    8               8  8.0
2  2019-09-25   5.0    4               4  4.0
3  2019-09-26   NaN    7               7  7.0
4  2019-09-27   NaN    9               9  9.0
5  2019-09-28   NaN    1               1  1.0
6  2019-09-29   NaN    5               5  5.0
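The steps above rely on the default integer index, since they subtract 2 from an index label. If the frame is indexed by something else (dates, for example), a positional variant may be safer. The following is only a sketch under assumptions: the helper name `first_contract` and its `near`, `far` and `offset` parameters are hypothetical, not part of the original answer. It picks the third-last non-missing ‘Sep’ value by position, which coincides with `last_valid_index() - 2` whenever the valid ‘Sep’ values are contiguous.

```python
import numpy as np
import pandas as pd


def first_contract(df, near="Sep", far="Dec", offset=2):
    """Hypothetical helper: return the `far` column with one row replaced by
    the `near` value that sits `offset` valid entries before the last one."""
    # Coerce 'NaN' strings (or any non-numeric text) to real missing values.
    near_vals = pd.to_numeric(df[near], errors="coerce")
    # Integer positions of the rows where `near` holds a real number.
    valid_pos = np.flatnonzero(near_vals.notna().to_numpy())
    # Work in float so the result dtype does not depend on the assignment.
    out = df[far].astype("float64")
    if len(valid_pos) > offset:
        pos = valid_pos[-(offset + 1)]  # positional, so any index type works
        out.iloc[pos] = near_vals.iloc[pos]
    return out


df = pd.DataFrame({
    "Date": ["2019-09-23", "2019-09-24", "2019-09-25", "2019-09-26",
             "2019-09-27", "2019-09-28", "2019-09-29"],
    "Sep": [1, 10, 5, "NaN", "NaN", "NaN", "NaN"],
    "Dec": [2, 8, 4, 7, 9, 1, 5],
}).set_index("Date")  # non-integer index on purpose

df["First_Contract"] = first_contract(df)
print(df)
```

If ‘Sep’ has fewer than three valid values, this sketch leaves the ‘Dec’ values untouched instead of raising an error. That is a design choice of the sketch, not something the question specifies.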

