I am trying to measure how similar listA, listB, listC, ... are to the original list. How do I print the number of elements that occur in the same sequence in listA as they occur in the original list?
original_list = ['I', 'live', 'in', 'space', 'with', 'my', 'dog']
listA = ['my', 'name', 'my', 'dog', 'is', 'two', 'years', 'old']
listB = ['how', 'where', 'I', 'live', 'in', 'space', 'with']
listC = ['I', 'live', 'to', 'the', 'in', 'space', 'with', 'my', 'football', 'my', 'dog']

Output:
listA: Count = 2      #'my', 'dog'
listB: Count = 5      #'I', 'live', 'in', 'space', 'with'
listC: Count = 2,4,2  #'I', 'live' #'in', 'space', 'with', 'my' #'my', 'dog'
Answer
I wrote a function that I think does the job. It might be more complex than necessary, but I can't see an easier way at the moment:
original = ['I', 'live', 'in', 'space', 'with', 'my', 'dog']
listA = ['my', 'name', 'my', 'dog', 'is', 'two', 'years', 'old']
listB = ['how', 'where', 'I', 'live', 'in', 'space', 'with']
listC = ['I', 'live', 'to', 'the', 'in', 'space', 'with', 'my', 'football', 'my', 'dog']


def get_sequence_lengths(original_list, comparative_list):
    # Every contiguous slice of length >= 2 from the original list.
    original_options = []
    for i in range(len(original_list)):
        for j in range(i + 1, len(original_list)):
            original_options.append(original_list[i:j + 1])

    # Every contiguous slice of length >= 2 from the list being compared.
    comparative_options = []
    for i in range(len(comparative_list)):
        for j in range(i + 1, len(comparative_list)):
            comparative_options.append(comparative_list[i:j + 1])

    # Try the longest candidates first.
    comparative_options.sort(key=len, reverse=True)

    matches = []
    while comparative_options:
        for option in comparative_options:
            if option in original_options:
                matches.append(option)
                # Drop the candidates already covered by this match.
                new_comparative_options = comparative_options.copy()
                for l in comparative_options:
                    counter = 0
                    for v in option:
                        counter = counter + 1 if v in l else 0
                        if counter == len(l):
                            new_comparative_options.remove(l)
                            break
                comparative_options = new_comparative_options
                break
        # Stop when no candidates remain or the scan ended without a match.
        if not comparative_options or option == comparative_options[-1]:
            break

    # Report the matches in the order they occur in the original list.
    matches = [option for option in original_options if option in matches]
    lengths = [len(option) for option in matches]
    print(lengths)
    print(matches)
    return lengths
If you call it with the original list and the example lists, it prints the following:

get_sequence_lengths(original, listA) prints [2] and [['my', 'dog']].
get_sequence_lengths(original, listB) prints [5] and [['I', 'live', 'in', 'space', 'with']].
get_sequence_lengths(original, listC) prints [2, 4, 2] and [['I', 'live'], ['in', 'space', 'with', 'my'], ['my', 'dog']].
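If a shorter version is acceptable, here is a minimal sketch of a different technique: a greedy left-to-right scan. It is not the function above, and it rests on a few assumptions of mine: a run only counts if it has at least two elements, runs are reported in the order they appear in the list being compared (not in the order of the original list), and each position takes the longest run that starts there. The names run_lengths and min_length are hypothetical. It gives the same counts for the three example lists, but I have not checked that it agrees with the function above on every input.

```python
def run_lengths(original_list, comparative_list, min_length=2):
    """Greedy scan: lengths of runs in comparative_list that also occur
    as a contiguous slice of original_list (assumed minimum run length 2)."""
    n, m = len(original_list), len(comparative_list)
    lengths = []
    i = 0
    while i < m:
        # Longest run starting at comparative_list[i] that matches a
        # contiguous slice of original_list, trying every start position.
        best = 0
        for start in range(n):
            k = 0
            while (start + k < n and i + k < m
                   and original_list[start + k] == comparative_list[i + k]):
                k += 1
            best = max(best, k)
        if best >= min_length:
            lengths.append(best)
            i += best  # skip past the run that was just counted
        else:
            i += 1
    return lengths


original = ['I', 'live', 'in', 'space', 'with', 'my', 'dog']
listA = ['my', 'name', 'my', 'dog', 'is', 'two', 'years', 'old']
listB = ['how', 'where', 'I', 'live', 'in', 'space', 'with']
listC = ['I', 'live', 'to', 'the', 'in', 'space', 'with', 'my', 'football', 'my', 'dog']

for name, words in (('listA', listA), ('listB', listB), ('listC', listC)):
    print(name, run_lengths(original, words))
```

For the example lists this prints listA [2], listB [5] and listC [2, 4, 2]. Because it never builds the full set of slices, it also avoids the memory cost of listing every sub-list up front.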