I have two dataframes:
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['2013-11-24', '2013-11-24', '2013-11-25', '2013-11-25'],
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery'],
    'Num': [22.1, 8.6, 7.6, 10.2],
    'Color': ['Yellow', 'Orange', 'Green', 'Green'],
})
print(df1)

         Date   Fruit   Num   Color
0  2013-11-24  Banana  22.1  Yellow
1  2013-11-24  Orange   8.6  Orange
2  2013-11-25   Apple   7.6   Green
3  2013-11-25  Celery  10.2   Green

df2 = pd.DataFrame({
    'Date': ['2013-11-25', '2013-11-25', '2013-11-25', '2013-11-25', '2013-11-25', '2013-11-25'],
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery', 'X', 'Y'],
    'Num': [22.1, 8.6, 7.6, 10.2, 22.1, 8.6],
    'Color': ['Yellow', 'Orange', 'Green', 'Green', 'Red', 'Orange'],
})
print(df2)

         Date   Fruit   Num   Color
0  2013-11-25  Banana  22.1  Yellow
1  2013-11-25  Orange   8.6  Orange
2  2013-11-25   Apple   7.6   Green
3  2013-11-25  Celery  10.2   Green
4  2013-11-25       X  22.1     Red
5  2013-11-25       Y   8.6  Orange

I am trying to find the difference between these two dataframes based on the column Fruit.
This is what I am doing now, but I am not getting the expected output:
mapped_df = pd.concat([df1, df2], ignore_index=True).drop_duplicates(keep=False)
print(mapped_df)

Expected output:

         Date Fruit   Num   Color
8  2013-11-25     X  22.1     Red
9  2013-11-25     Y   8.6  Orange

Answer
You can use a negated isin:

output = df2.loc[~df2['Fruit'].isin(df1['Fruit'])]
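
For completeness, here is a self-contained sketch of that one-liner using the data from the question. The name in_df1 is introduced only for illustration. The second half is an assumption about what the original attempt was aiming for, not part of the answer above: drop_duplicates compares every column by default, so rows that share a Fruit but differ in Date are not treated as duplicates, and restricting the check with subset='Fruit' is one possible way to adjust it.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['2013-11-24', '2013-11-24', '2013-11-25', '2013-11-25'],
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery'],
    'Num': [22.1, 8.6, 7.6, 10.2],
    'Color': ['Yellow', 'Orange', 'Green', 'Green'],
})
df2 = pd.DataFrame({
    'Date': ['2013-11-25'] * 6,
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery', 'X', 'Y'],
    'Num': [22.1, 8.6, 7.6, 10.2, 22.1, 8.6],
    'Color': ['Yellow', 'Orange', 'Green', 'Green', 'Red', 'Orange'],
})

# Boolean mask: True where a Fruit in df2 also occurs in df1
# (in_df1 is an illustrative name, not from the original post)
in_df1 = df2['Fruit'].isin(df1['Fruit'])

# Keep only the rows of df2 whose Fruit is absent from df1
output = df2.loc[~in_df1]
print(output)

# Possible adjustment of the original attempt (an assumption, see above):
# compare on Fruit only, and drop every Fruit that occurs more than once
combined = pd.concat([df1, df2], ignore_index=True)
sym_diff = combined.drop_duplicates(subset='Fruit', keep=False)
print(sym_diff)
```

Both results contain the X and Y rows. The isin version keeps df2's own row labels (4 and 5), while the concat version shows 8 and 9 as in the expected output because of ignore_index=True. Note that the concat version is a symmetric difference: it would also return fruits that appear only in df1, and it would drop any fruit repeated within a single dataframe, so the isin approach is probably the safer choice when only the rows of df2 are wanted.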