Filter 2d list by another 2d list

I have a list A:

A = [['512', '102'],
     ['410', '105'],
     ['820', '520']]

And list B:

B = [['510', '490', '512', '912'],
     ['512', '108', '102', '520', '901', '821'],
     ['510', '118', '284']]

I would like to keep only those rows of list A whose values are all contained in a single row of list B (any row will do). So my expected output is:

[['512', '102']]

Because the values '512' and '102' are both in the second row of list B.

I know how to do this by iterating over every item in list A and comparing it with every element in list B, but I have ~500000 rows in list A and ~10000 rows in list B, so it is extremely slow.

Is there a more efficient way to do this?

Answer

You should work with sets here: a membership test on a set takes constant time on average, while on a list it requires a linear scan.

Here is one solution:

[i for i in A if any(set(i)-set(k)==set() for k in B)]

Result:

[['512', '102']]

Explanation:

set(i)-set(k)==set()

checks whether all items of i are contained in k: the difference is the empty set exactly when no element of i is missing from k.
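As a small illustration of that check, the sketch below compares the set-difference form with Python's built-in subset tests on one row from each list. The variable names by_difference, by_issubset and by_operator are made up for this example and are not part of the original answer.

```python
i = ['512', '102']
k = ['512', '108', '102', '520', '901', '821']

# The difference is empty exactly when every element of i is also in k.
by_difference = set(i) - set(k) == set()

# issubset accepts any iterable as its argument.
by_issubset = set(i).issubset(k)

# The <= operator requires both operands to be sets.
by_operator = set(i) <= set(k)

print(by_difference, by_issubset, by_operator)
```

All three forms express the same subset test, so any of them could be used inside the comprehension.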

any(set(i)-set(k)==set() for k in B)

checks whether the above holds for at least one row k of B, for a given row i of A. Finally,

[i for i in A if any(set(i)-set(k)==set() for k in B)]

returns all rows of A that satisfy the above condition.
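The one-liner rebuilds set(k) for every pair of rows, which adds up at the sizes mentioned in the question. The sketch below shows two possible variations, neither of which is from the original answer: converting the rows of B to sets once, and building an inverted index from each value to the B rows that contain it. The function names filter_rows_precomputed and filter_rows_indexed are hypothetical, and whether the index is actually faster depends on the data, so it is worth timing both on real input.

```python
from collections import defaultdict

A = [['512', '102'], ['410', '105'], ['820', '520']]
B = [['510', '490', '512', '912'],
     ['512', '108', '102', '520', '901', '821'],
     ['510', '118', '284']]


def filter_rows_precomputed(a_rows, b_rows):
    """Same logic as the one-liner, but each row of B becomes a set only once."""
    b_sets = [set(row) for row in b_rows]
    return [row for row in a_rows if any(s.issuperset(row) for s in b_sets)]


def filter_rows_indexed(a_rows, b_rows):
    """Map each value to the positions of the B rows containing it, then
    intersect those position sets for the values of each A row."""
    index = defaultdict(set)
    for pos, row in enumerate(b_rows):
        for value in row:
            index[value].add(pos)

    kept = []
    for row in a_rows:
        # Start with every B row as a candidate (matches the one-liner for
        # an empty A row) and narrow down value by value.
        candidates = set(range(len(b_rows)))
        for value in row:
            candidates &= index.get(value, set())
            if not candidates:
                break  # no B row can contain all values of this A row
        if candidates:
            kept.append(row)
    return kept


print(filter_rows_precomputed(A, B))
print(filter_rows_indexed(A, B))
```

With the index, an A row containing a value that never occurs in B is rejected after a single dictionary lookup instead of a pass over every row of B.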

User contributions licensed under: CC BY-SA