Most efficient way to compare the values of list one with list two returning a subset of list one

I have two lists of lists:

trecords = [[1072, 'chickens', 'are', 'messy'], [1073,'chickens', 'eat', 'grass'],...]
srecords = [[1, 'chickens', 'attack', 'lizards'], [12,'chickens', 'eat', 'grass'],...]

I need to compare the trailing values of each record (everything after the first element) and return the numeric first element of every record in trecords whose trailing values are not contained in srecords, producing a list of values like [1072,...]

The following code works:

droplist = []
for trecord in trecords:
    if trecord[1::] not in [srecord[1::] for srecord in srecords]:
        droplist.append(trecord[0])

I would rather have something like this if it is faster:

droplist = [trecord[0] for trecord in trecords if trecords[1::] not in srecords[1::]]

But that matches on every value, and I do not know why. The actual length of the lists is 300k values each. Is this the fastest way to compare them? I’ve also put the data in a dictionary, with the number as the key and the text (a list) as the value, but that seems slower.

Answer

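I think you just needed one more level of nesting in your list comprehension. Your version slices the outer lists (trecords[1::] and srecords[1::]) instead of each record, so it asks whether the whole list of records is an element of srecords[1::]. It never is, so the condition is true for every record and every value ends up in droplist. Slicing each record instead gives: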

droplist = [trecord[0] for trecord in trecords if trecord[1:] not in [srecord[1:] for srecord in srecords]]
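On the speed question: the comprehension above rebuilds the inner list of srecord[1:] slices for every trecord and then scans it linearly, so with 300k records on each side it will probably be slow. A possible alternative, sketched here on the two sample records from the question (the name seen is made up for illustration), is to build a set of tuples once and test membership against that. This assumes the values after the first element are hashable, as the strings in the example are.

```python
# Sample data from the question, truncated to the two records shown.
trecords = [[1072, 'chickens', 'are', 'messy'], [1073, 'chickens', 'eat', 'grass']]
srecords = [[1, 'chickens', 'attack', 'lizards'], [12, 'chickens', 'eat', 'grass']]

# Lists are unhashable, so convert each record's tail to a tuple
# before storing it in a set. This set is built only once.
seen = {tuple(srecord[1:]) for srecord in srecords}

# Set membership is a hash lookup on average, not a scan of srecords.
droplist = [trecord[0] for trecord in trecords if tuple(trecord[1:]) not in seen]

print(droplist)  # [1072]
```

This should give the same droplist as the nested comprehension, but with one pass over each list rather than one pass over srecords per trecord.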