I am looking to compare values of columns in two different datasets and create a column with the results that have matched.
DF1:
Dates 0 2021-10-01 1 2021-10-02 2 2021-10-03 3 2021-10-04 4 2021-10-05 5 2021-10-06 6 2021-10-07
DF2 =
StartDate User 2021-10-04 Doe, John 2021-10-07 Ann, Mary 2021-10-07 Doe, John
Expected Result
DF1:
Dates User 0 2021-10-01 1 2021-10-02 2 2021-10-03 3 2021-10-04 Doe, John 4 2021-10-05 5 2021-10-06 6 2021-10-07 Ann, Mary; Doe, John
I can see the matches using the following code:
df1= df2.loc[df2['StartDate'].isin(df1['Dates']), 'User']
And i can return a 1/0 with the following:
df1= df1.assign(result=df1['Dates'].isin(df2['StartDate']).astype(int))
However, i cannot seem to merge them both
Advertisement
Answer
df.join(df2.groupby('StartDate')['User'].apply('; '.join), how='left', on='Dates').fillna('')
Output:
>>> df Dates User 0 2021-10-01 1 2021-10-02 2 2021-10-03 3 2021-10-04 Test1 4 2021-10-05 5 2021-10-06 6 2021-10-07 Test2, Test1