
Dictionary from list comprehension

I have the following list.

arr_lst = [(1, 34, 99), (2, 35, 40), (2, 36, 50), (2, 37, 10), (3, 37, 90), (3, 38, 8)]

I collected the dictionary keys, which are the first item of each tuple.

keys = {i[0] for i in arr_lst}
# output
# {1, 2, 3}

From there, I created a dictionary that maps each key to the list of tuples from the first list whose first item equals that key.

id_dict = dict()
for k in keys:
    id_dict[k] = [i for i in arr_lst if i[0] == k]
# output
# {1: [(1, 34, 99)], 2: [(2, 35, 40), (2, 36, 50), (2, 37, 10)], 3: [(3, 37, 90), (3, 38, 8)]}

Then I created a new list of tuples from the dictionary values. For each key, the tuple whose third item is the highest gets appended to the list with the third item removed.

output_id_etak_id = []
for k, v in id_dict.items():
    m = max(v, key=lambda x: x[2])
    output_id_etak_id.append(m[:2])
# output
# [(1, 34), (2, 36), (3, 37)]

The code works and I get the desired output. However, I have a large dataset with over 800 000 elements in the first list, and it currently takes about 3 hours to run. I would like to find a way to make it faster.


Answer

You can use itertools.groupby (which expects sorted input) to form groups based on the first element of each tuple, and then select the first element of each group using next (suggested by @tobias-k).

Note: To use next, the list must be sorted with the key (x[0], -x[-1]), so that within each group the tuples are in descending order of their last item and the first tuple of each group is the one with the highest value.

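from itertools import groupby

arr_lst = [(1, 34, 99), (2, 35, 40), (2, 36, 50), (2, 37, 10), (3, 37, 90), (3, 38, 8)]
# Sort by first item ascending, then by last item descending.
arr_lst = sorted(arr_lst, key=lambda x: (x[0], -x[-1]))
# The first tuple of each group now has the highest last item; drop that item.
result = [
    next(group)[:2]
    for _, group in groupby(arr_lst, key=lambda x: x[0])
]
# output
# [(1, 34), (2, 36), (3, 37)]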
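
If the sort is not needed for anything else, a single pass over the list with a plain dictionary is another option. The following is only a minimal sketch and not part of the approach described above. The function name best_per_key is hypothetical, and it assumes every tuple has at least three items with comparable third items.

```python
def best_per_key(rows):
    """Return (first, second) of the row with the largest third item per first item."""
    best = {}  # maps first item -> best row seen so far
    for row in rows:
        key, value = row[0], row[2]
        current = best.get(key)
        # Replace only on a strictly larger value, so the earliest row wins ties.
        if current is None or value > current[2]:
            best[key] = row
    # Dictionaries keep insertion order, so keys appear in first-seen order.
    return [row[:2] for row in best.values()]


arr_lst = [(1, 34, 99), (2, 35, 40), (2, 36, 50), (2, 37, 10), (3, 37, 90), (3, 38, 8)]
result = best_per_key(arr_lst)
print(result)  # [(1, 34), (2, 36), (3, 37)]
```

This visits each tuple once instead of rescanning the whole list for every key. The result is ordered by the first appearance of each key rather than by sorted key, so wrap it in sorted() if key order matters.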
User contributions licensed under: CC BY-SA