I am looking to compare values of columns in two different datasets and create a column with the results that have matched.
DF1:
JavaScript
x
9
1
Dates
2
0 2021-10-01
3
1 2021-10-02
4
2 2021-10-03
5
3 2021-10-04
6
4 2021-10-05
7
5 2021-10-06
8
6 2021-10-07
9
DF2 =
JavaScript
1
5
1
StartDate User
2
2021-10-04 Doe, John
3
2021-10-07 Ann, Mary
4
2021-10-07 Doe, John
5
Expected Result
DF1:
JavaScript
1
9
1
Dates User
2
0 2021-10-01
3
1 2021-10-02
4
2 2021-10-03
5
3 2021-10-04 Doe, John
6
4 2021-10-05
7
5 2021-10-06
8
6 2021-10-07 Ann, Mary; Doe, John
9
I can see the matches using the following code:
JavaScript
1
2
1
df1= df2.loc[df2['StartDate'].isin(df1['Dates']), 'User']
2
And i can return a 1/0 with the following:
JavaScript
1
2
1
df1= df1.assign(result=df1['Dates'].isin(df2['StartDate']).astype(int))
2
However, i cannot seem to merge them both
Advertisement
Answer
JavaScript
1
2
1
df.join(df2.groupby('StartDate')['User'].apply('; '.join), how='left', on='Dates').fillna('')
2
Output:
JavaScript
1
10
10
1
>>> df
2
Dates User
3
0 2021-10-01
4
1 2021-10-02
5
2 2021-10-03
6
3 2021-10-04 Test1
7
4 2021-10-05
8
5 2021-10-06
9
6 2021-10-07 Test2, Test1
10