Given two pandas Series whose values are lists (i.e. each row in the Series is a list), I want to take the row-wise set difference of the two columns.
For example, in the dataframe…
pd.DataFrame({
    'A': [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    'B': [[1, 2], [5, 6], [7, 8, 9]]
})
I want to create a new column C that is set(A) - set(B):
pd.DataFrame({
    'C': [[3], [4], []]
})
Answer
Thanks to: https://www.geeksforgeeks.org/python-difference-two-lists/
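def Diff(li1, li2):
    # Note: this is the symmetric difference (items found in exactly one list).
    # It equals set(li1) - set(li2) only when every item of li2 is also in li1.
    return list(set(li1) - set(li2)) + list(set(li2) - set(li1))

# Apply the function row by row (axis=1) to build the new column
df['C'] = df.apply(lambda x: Diff(x['A'], x['B']), axis=1)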
Output
           A          B    C
0  [1, 2, 3]     [1, 2]  [3]
1  [4, 5, 6]     [5, 6]  [4]
2  [7, 8, 9]  [7, 8, 9]   []
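Note that Diff collects items from both directions, so it would also return elements that appear only in B. If you want strictly set(A) - set(B), as the question asks, a minimal sketch could look like the following. The helper name one_sided_diff is my own (hypothetical, not part of pandas), and I am assuming the list items are hashable and that keeping A's original order is desirable. Unlike a true set difference, this version keeps any duplicates present in A.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    'B': [[1, 2], [5, 6], [7, 8, 9]],
})


def one_sided_diff(a, b):
    """Hypothetical helper: items of `a` that are not in `b`, in `a`'s order."""
    exclude = set(b)  # set lookup keeps the membership test cheap
    return [item for item in a if item not in exclude]


# Pair up the two columns with zip instead of df.apply(..., axis=1)
df['C'] = [one_sided_diff(a, b) for a, b in zip(df['A'], df['B'])]

print(df['C'].tolist())  # [[3], [4], []]
```

For the example data this gives the same column C as above, because every element of B also appears in A. The two approaches diverge only when B contains something A does not.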