Given two pandas Series whose elements are lists (i.e. each row of the Series is a list), I want to take the row-wise set difference of the two columns.
For example, in the dataframe…
pd.DataFrame({ 'A': [[1, 2, 3], [4, 5, 6], [7, 8, 9]], 'B': [[1, 2], [5, 6], [7, 8, 9]] })
I want to create a new column C that is set(A) - set(B)…
pd.DataFrame({ 'C': [[3], [4], []] })
Answer
Thanks to: https://www.geeksforgeeks.org/python-difference-two-lists/
def Diff(li1, li2):
    # Elements of li1 that are not in li2, i.e. set(li1) - set(li2)
    return list(set(li1) - set(li2))

df['C'] = df.apply(lambda x: Diff(x['A'], x['B']), axis=1)
Output
           A          B    C
0  [1, 2, 3]     [1, 2]  [3]
1  [4, 5, 6]     [5, 6]  [4]
2  [7, 8, 9]  [7, 8, 9]   []
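Note that converting to sets discards the original order of the elements and any duplicates within a list. If that matters, here is a minimal sketch that keeps the order of A, assuming the list elements are hashable. The helper name ordered_diff is hypothetical and not part of the linked article.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    'B': [[1, 2], [5, 6], [7, 8, 9]],
})

def ordered_diff(a, b):
    """Hypothetical helper: items of a that are not in b, in a's order."""
    exclude = set(b)  # set lookup keeps each membership test cheap
    return [x for x in a if x not in exclude]

# Pair the two columns row by row instead of using df.apply(axis=1)
df['C'] = [ordered_diff(a, b) for a, b in zip(df['A'], df['B'])]
print(df)
```

Unlike the set-based version, this keeps repeated values from A that do not appear in B, so pick whichever behavior matches your data.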