Iterating Through List of Lists, Maintaining List Structure

Question

Say I have the following list of list of names: I want to return only the "Matts" in the list, but I also want to maintain the list of list structure. So I want to return: I've something like this, but this will append everthting together in one big list: I know something like this is possible, but I want

Accepted Answer

Use a nested list comprehension, as below:names = [['Matt', 'Matt', 'Paul'], ['Matt']]res = [[name for name in lst if name == "Matt"] for lst in names]print(res)Output[['Matt', 'Matt'], ['Matt']]The above nested list comprehension is equivalent to the following for-loop:res = []for lst in names:    res.append([name for name in lst if name == "Matt"])print(res)A third alternative functional alternative using filter and partial, is to do:from operator import eqfrom functools import partialnames = [['Matt', 'Matt', 'Paul'], ['Matt']]eq_matt = partial(eq, "Matt")res = [[*filter(eq_matt, lst)] for lst in names]print(res)Micro-Benchmark%timeit [[*filter(eq_matt, lst)] for lst in names]56.3 µs ± 519 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)%timeit [[name for name in lst if "Matt" == name] for lst in names]26.9 µs ± 355 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)Setup (of micro-benchmarks)import randompopulation = ["Matt", "James", "William", "Charles", "Paul", "John"]names = [random.choices(population, k=10) for _ in range(50)]Full BenchmarkCandidatesdef nested_list_comprehension(names, needle="Matt"):    return [[name for name in lst if needle == name] for lst in names]def functional_approach(names, needle="Matt"):    eq_matt = partial(eq, needle)    return [[*filter(eq_matt, lst)] for lst in names]def count_approach(names, needle="Matt"):    return [[needle] * name.count(needle) for name in names]PlotThe above results were obtained for a list that varies from 100 to 1000 elements where each element is a list of 10 strings chosen at random from a population of 14 strings (names). The code for reproducing the results can be found here.As it can be seen from the plot the most performant solution is the one from @rv.kvetch.

Advertisement

Answer