I have a relatively large array called allListings and want to keep only the rows whose element at index 14 equals listingID.
This is the code I am using:
tempRows = list(filter(lambda x: x[14] == listingID, allListings))
The filtering is repeated in a for loop, once for each distinct listingID.
Profiling shows that this line consumes 95% of the runtime in the loop. Is there a more efficient way to filter large arrays?
Answer
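As suggested in the comments, if you perform multiple operations keyed on the value of one column, you may want to sort and group the rows by that column once up front. The sort matters because itertools.groupby only groups consecutive items with the same key.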
>>> from itertools import groupby
>>> a = [[1, 2, 3, 5],
...      [4, 6, 2, 8],
...      [1, 5, 7, 9],
...      [3, 5, 8, 2]]
>>> b = sorted(a, key=lambda x: x[0])
>>> b
[[1, 2, 3, 5], [1, 5, 7, 9], [3, 5, 8, 2], [4, 6, 2, 8]]
>>> c = groupby(b, key=lambda x: x[0])
>>> c
<itertools.groupby object at 0x106b763e0>
>>> d = {k: list(v) for k, v in c}
>>> d
{1: [[1, 2, 3, 5], [1, 5, 7, 9]], 3: [[3, 5, 8, 2]], 4: [[4, 6, 2, 8]]}
Now, if you need all lists where the first element is 1, you simply need:
>>> d[1]
[[1, 2, 3, 5], [1, 5, 7, 9]]
Or, if you want every row except those with 1 in the first position:
>>> [x for k, v in d.items()
...    if k != 1
...    for x in v]
[[3, 5, 8, 2], [4, 6, 2, 8]]
This is a simplified example, but it should be easy to adapt to your situation.
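For the case in the question, a rough sketch of the same idea might look like the following. It swaps sorted + groupby for a collections.defaultdict, which builds the groups in a single pass and needs no sorting; that is a variation on the approach above, not something taken from the question. The helper name group_rows_by_column and the sample rows are made up for illustration; only allListings, listingID, tempRows and the column index 14 come from the question.

```python
from collections import defaultdict

KEY_COLUMN = 14  # column index taken from the question


def group_rows_by_column(rows, column):
    """Build {value: [rows...]} in a single pass over rows (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return groups


# Made-up stand-in data: 15-column rows with the listing ID in column 14.
allListings = [
    ["a"] * 14 + [101],
    ["b"] * 14 + [102],
    ["c"] * 14 + [101],
]

# Group once, outside the loop, instead of calling filter() per listingID.
rows_by_listing = group_rows_by_column(allListings, KEY_COLUMN)

for listingID, tempRows in rows_by_listing.items():
    # tempRows holds the rows the original filter() call would return
    # for this listingID, in their original order.
    print(listingID, len(tempRows))
```

The point of the design is that the full list is scanned once rather than once per listingID; each lookup inside the loop is then a plain dictionary access.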