How to find and calculate common letters between words in pandas

Question

I have a dataset with some words in it, and I want to compare two columns and count the common letters between them. For example, I have two columns of words, and for each row I want to get something like the matching letters and their count.

Accepted Answer

You can use a list comprehension with the help of itertools.takewhile:

    from itertools import takewhile

    # Keep letters pair by pair until the first position where they differ
    df['Match'] = [[x for x, y in takewhile(lambda x: x[0] == x[1], zip(a, b))]
                   for a, b in zip(df['Col_1'], df['Col_2'])]
    df['Count'] = df['Match'].str.len()

Output:

        Col_1   Col_2               Match  Count
    0  Heaven  Heaven  [H, e, a, v, e, n]      6
    1    Jako   Jakob        [J, a, k, o]      4
    2      Sm   Smart              [S, m]      2
    3  apizza   pizza                  []      0

NB. The logic was not fully clear, so this version stops as soon as there is a mismatch.

If you want to continue after a mismatch (which doesn't seem to fit the "pizza" example):

    # Keep every letter that matches at the same position, skipping mismatches
    df['Match'] = [[x for x, y in zip(a, b) if x == y]
                   for a, b in zip(df['Col_1'], df['Col_2'])]
    df['Count'] = df['Match'].str.len()

Output:

        Col_1   Col_2               Match  Count
    0  Heaven  Heaven  [H, e, a, v, e, n]      6
    1    Jako   Jakob        [J, a, k, o]      4
    2      Sm   Smart              [S, m]      2
    3  apizza   pizza                 [z]      1
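To try both variants end to end, here is a minimal self-contained sketch. The sample DataFrame is an assumption reconstructed from the output shown in the answer, since the question's input table is not preserved. The helper names `common_prefix` and `positional_matches` and the extra column names are hypothetical, chosen only so both results can sit side by side.

```python
from itertools import takewhile

import pandas as pd

# Sample data reconstructed from the answer's output (an assumption;
# the original input table from the question is not available).
df = pd.DataFrame({
    "Col_1": ["Heaven", "Jako", "Sm", "apizza"],
    "Col_2": ["Heaven", "Jakob", "Smart", "pizza"],
})


def common_prefix(a, b):
    """Hypothetical helper: letters shared position by position,
    stopping at the first mismatch."""
    return [x for x, _ in takewhile(lambda pair: pair[0] == pair[1], zip(a, b))]


def positional_matches(a, b):
    """Hypothetical helper: letters that are equal at the same position,
    continuing past mismatches."""
    return [x for x, y in zip(a, b) if x == y]


pairs = list(zip(df["Col_1"], df["Col_2"]))

# Variant 1: stop at the first mismatch
df["Prefix"] = [common_prefix(a, b) for a, b in pairs]
df["PrefixCount"] = df["Prefix"].str.len()

# Variant 2: keep going after a mismatch
df["Positional"] = [positional_matches(a, b) for a, b in pairs]
df["PositionalCount"] = df["Positional"].str.len()

print(df)
```

Both variants compare letters at the same index only, because `zip` pairs the words position by position and stops at the end of the shorter word. That is why "apizza" and "pizza" share nothing in the first variant and only "z" in the second.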
