Python: What is an efficient way to loop over a list of strings and group substrings in the list?

Background

JavaScript

I would like to find the substrings in the list and group them into a list of tuples, where the first element of each tuple is the substring and the second element is the longer string that contains it. The expected output is given below:

JavaScript
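
To make the target shape concrete, here is a minimal sketch on a made-up list. The strings and the helper name `naive_pairs` are hypothetical, not the asker's data or code. It checks every ordered pair, which is the straightforward but slow baseline:

```python
from itertools import permutations


def naive_pairs(strings):
    """Return (substring, containing_string) tuples by checking every ordered pair."""
    # permutations(strings, 2) yields each ordered pair of distinct positions,
    # so this performs roughly n * (n - 1) substring checks.
    return [(a, b) for a, b in permutations(strings, 2) if a in b]


# Hypothetical sample data, for illustration only.
words = ["bat", "combat", "at", "zebra"]
result = naive_pairs(words)
print(result)
# [('bat', 'combat'), ('at', 'bat'), ('at', 'combat')]
```

Each tuple pairs a shorter string with a longer string that contains it; "zebra" appears in no tuple because it neither contains nor is contained in any other entry.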

I’ve written the following code, which achieves the desired outcome:

JavaScript

Is there a more efficient way to do this? I’ll eventually need to run this over a list containing 80k strings. I appreciate any suggestions or help.

Answer

Combining suggestions in the comments and @ZabielskiGrabriel’s answer, you can do it by first sorting the list and then comparing each element in the sorted list with those that follow it in a list comprehension:

JavaScript
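
The sort-then-compare idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not necessarily the exact code from the answer: it assumes the sort key is string length (so a string can only be contained in strings that come after it), and the function name `group_substrings` and the sample list are hypothetical.

```python
def group_substrings(strings):
    """Return (substring, containing_string) tuples using a length-sorted list."""
    # Assumption: sort by length. A string can only be a substring of a string
    # that is at least as long, so each element needs to be compared only with
    # the elements that follow it in the sorted order.
    ordered = sorted(strings, key=len)
    return [
        (short, longer)
        for i, short in enumerate(ordered)
        for longer in ordered[i + 1:]
        if short in longer
    ]


# Hypothetical sample data, for illustration only.
sample = ["bat", "combat", "at", "zebra"]
pairs = group_substrings(sample)
print(pairs)
# [('at', 'bat'), ('at', 'combat'), ('bat', 'combat')]
```

Compared with checking every ordered pair, this roughly halves the number of substring tests, but the number of comparisons is still quadratic in the list length, so it may remain slow for 80k strings. Note also that exact duplicates in the input are reported once as a pair, because equal strings contain each other.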

Benchmarks (with supplied test list):

JavaScript

Output:

JavaScript
User contributions licensed under: CC BY-SA