Python: a faster method of finding indexes in a list of 2 million+ items that match a string condition

Question

Hello, I have shared the reproducible code above. Background: let me quickly introduce my data. station_combinations is the cross product of "my_list" with itself, with the two items joined by "_". You can think of each entry as a trip between two "my_list" items, so 1_2 would be going from 1 to 2 …

Accepted Answer

One solution can be using indexes, in this case two indexes for a and b. For example:

    import itertools
    from time import time

    my_list = list(range(1700))

    cross_product = list(itertools.product(my_list, my_list))
    station_combinations = [
        "_".join([str(i), str(b)]) for i, b in cross_product if i != b
    ]

    # precompute indexes:
    index_a = {}
    index_b = {}
    for i, s in enumerate(station_combinations):
        a, b = s.split("_")
        index_a.setdefault(a, []).append(i)
        index_b.setdefault(b, []).append(i)

    station_name = "5"

    start = time()
    for h in range(10):
        reverse_indexes_new = index_b.get(station_name, [])
        regular_indexes_new = index_a.get(station_name, [])
    print(time() - start)

Prints on my machine:

    7.62939453125e-06
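To check that the precomputed indexes return the same positions as a scan over every string, a small comparison like the following may help. This is only a sketch: the full question is not shown here, so the `startswith`/`endswith` baseline is an assumption about what the original lookup did. The helper names `build_indexes` and `linear_scan` are hypothetical, and the data set is cut down to 50 stations so it runs quickly.

```python
from collections import defaultdict
from itertools import permutations


def build_indexes(combinations):
    """Map each origin and each destination to the positions where it occurs."""
    by_origin = defaultdict(list)
    by_destination = defaultdict(list)
    for position, name in enumerate(combinations):
        origin, destination = name.split("_")
        by_origin[origin].append(position)
        by_destination[destination].append(position)
    return by_origin, by_destination


def linear_scan(combinations, station):
    """Assumed baseline: test every string against a prefix/suffix condition."""
    regular = [i for i, s in enumerate(combinations) if s.startswith(station + "_")]
    reverse = [i for i, s in enumerate(combinations) if s.endswith("_" + station)]
    return regular, reverse


# Hypothetical small data set: 50 stations instead of 1700.
stations = [str(n) for n in range(50)]
combos = [f"{a}_{b}" for a, b in permutations(stations, 2)]

by_origin, by_destination = build_indexes(combos)
regular, reverse = linear_scan(combos, "5")

# The dictionary lookups should match the scan results position for position.
assert by_origin.get("5", []) == regular
assert by_destination.get("5", []) == reverse
```

The indexes cost one pass over the list to build, after which each lookup is a single dictionary access instead of a scan over every string. That trade-off pays off when many station names are queried against the same list.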


Answer