Say I have the following list of list of names:
names = [['Matt', 'Matt', 'Paul'], ['Matt']]
I want to return only the “Matts” in the list, but I also want to maintain the list of list structure. So I want to return:
[['Matt', 'Matt'], ['Matt']]
I’ve something like this, but this will append everthting together in one big list:
matts = [name for namelist in names for name in namelist if name=="Matt"]
I know something like this is possible, but I want to avoid iterating through lists and appending. Is this possible?
names = [['Matt', 'Matt', 'Paul'], ['Matt']] matts = [] for namelist in names: matts_namelist = [] for name in namelist: if name=="Matt": matts_namelist.append(name) else: pass matts.append(matts_namelist)
Advertisement
Answer
Use a nested list comprehension, as below:
names = [['Matt', 'Matt', 'Paul'], ['Matt']] res = [[name for name in lst if name == "Matt"] for lst in names] print(res)
Output
[['Matt', 'Matt'], ['Matt']]
The above nested list comprehension is equivalent to the following for-loop:
res = [] for lst in names: res.append([name for name in lst if name == "Matt"]) print(res)
A third alternative functional alternative using filter
and partial
, is to do:
from operator import eq from functools import partial names = [['Matt', 'Matt', 'Paul'], ['Matt']] eq_matt = partial(eq, "Matt") res = [[*filter(eq_matt, lst)] for lst in names] print(res)
Micro-Benchmark
%timeit [[*filter(eq_matt, lst)] for lst in names] 56.3 µs ± 519 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) %timeit [[name for name in lst if "Matt" == name] for lst in names] 26.9 µs ± 355 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Setup (of micro-benchmarks)
import random population = ["Matt", "James", "William", "Charles", "Paul", "John"] names = [random.choices(population, k=10) for _ in range(50)]
Full Benchmark
Candidates
def nested_list_comprehension(names, needle="Matt"): return [[name for name in lst if needle == name] for lst in names] def functional_approach(names, needle="Matt"): eq_matt = partial(eq, needle) return [[*filter(eq_matt, lst)] for lst in names] def count_approach(names, needle="Matt"): return [[needle] * name.count(needle) for name in names]
Plot
The above results were obtained for a list that varies from 100 to 1000 elements where each element is a list of 10 strings chosen at random from a population of 14 strings (names). The code for reproducing the results can be found here. As it can be seen from the plot the most performant solution is the one from @rv.kvetch.