I have a dataframe named calls, each call is recorded by a date (calls)[datetime]. each call has an answer status (calls)[Status].
a = {'datetime':['2021-11-22 18:47:02','2021-01-08 14:18:11','2021-06-08 16:40:45','2021-12-03 10:21:31','2021-12-03 15:21:31'], 'Status': ['P_TR_SOK', 'RS_GEN', 'P_TR_ST_','RS_GEN', 'MLR_ST']} calls = pd.DataFrame(data = a)
A second Dataframe named NewStatus with same column (NewStatus
)[datetime] and the column (NewStatus
)[New_Status] that I want to replace in the first dataframe with a date join
b = {'datetime': ['2021-11-22 18:47:02','2021-01-08 14:18:11','2021-06-08 16:40:45','2021-12-03 10:21:31'], 'New_Status': ['AB_CL_NO','REP','AB_AUT','DROP']} NewStatus = pd.DataFrame(data = b)
Desired result is the following for calls :
datetime | Status |
---|---|
2021-11-22 18:47:02 | AB_CL_NO |
2021-01-08 14:18:11 | REP |
2021-06-08 16:40:45 | AB_AUT |
2021-12-03 10:21:31 | DROP |
2021-12-03 15:21:31 | MLR_ST |
By using
calls.reset_index().set_index('datetime').join(NewStatus.drop_duplicates().set_index('datetime'), how='left', rsuffix='df2')
I am blocked how to replace the old Status by making the join with “datetime”
Advertisement
Answer
You could do it like this:
UPDATE
Be aware that pd.update
works inplace
.
calls= calls.set_index('datetime') calls.update(NewStatus.set_index('datetime').rename(columns={'New_Status': 'Status'})) print(calls) Status datetime 2021-11-22 18:47:02 AB_CL_NO 2021-01-08 14:18:11 REP 2021-06-08 16:40:45 AB_AUT 2021-12-03 10:21:31 DROP 2021-12-03 15:21:31 MLR_ST
Old answer
calls= calls.set_index('datetime') out = (calls.join(NewStatus.drop_duplicates() .set_index('datetime') .rename(columns={'New_Status':'Status'}), how='left', rsuffix='_updated') .rename(columns={'Status_updated': 'Replaced_Status'}) .drop('Status', axis=1) .dropna()) print(out) Replaced_Status datetime 2021-11-22 18:47:02 AB_CL_NO 2021-01-08 14:18:11 REP 2021-06-08 16:40:45 AB_AUT 2021-12-03 10:21:31 DROP