Count number of matches in pairs of pandas dataframe rows

Question

I have been trying to count how many values in one row of a dataframe match the values in the same columns of each other row, and to output those counts. To illustrate, I have a dataframe (df_testing) as follows: I am looking to count the number of exact matches between rows for the values in Col_1 to Col_4. For example, Row 0 has

Accepted Answer

You could use itertools.combinations, a dictionary comprehension and the Series constructor:

    from itertools import combinations

    import pandas as pd

    # index by the identifying columns so each row can be looked up by (SN, Age)
    df2 = df_testing.set_index(['SN', 'Age'])

    # for every pair of rows, count the columns whose values are equal
    out = (pd.Series({(*a, *b): (df2.loc[a] == df2.loc[b]).sum()
                      for a, b in combinations(df2.index, r=2)
                      })
             .rename_axis(('SN_A', 'Age_A', 'SN_B', 'Age_B'))
             .reset_index(name='Matched_Count')
           )

output:

       SN_A  Age_A  SN_B  Age_B  Matched_Count
    0     0     23     1     33              1
    1     0     23     2     40              3
    2     1     33     2     40              2
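For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same pairwise idea. The contents of df_testing below are hypothetical: the question's original table is not shown here, so the values were made up to have SN, Age and Col_1 to Col_4 columns. It also collects the results in a plain list of tuples instead of a Series, which is just a variation and not the exact code from the answer.

```python
from itertools import combinations

import pandas as pd

# Hypothetical sample data (assumption: not the asker's actual dataframe).
df_testing = pd.DataFrame({
    "SN": [0, 1, 2],
    "Age": [23, 33, 40],
    "Col_1": ["A", "A", "A"],
    "Col_2": ["B", "Y", "B"],
    "Col_3": ["C", "Z", "C"],
    "Col_4": ["D", "X", "X"],
})

# Index by the identifying columns so only Col_1..Col_4 are compared.
df2 = df_testing.set_index(["SN", "Age"])

records = []
for a, b in combinations(df2.index, r=2):
    # a and b are (SN, Age) tuples; compare the two rows column by column
    matched = int((df2.loc[a] == df2.loc[b]).sum())
    records.append((*a, *b, matched))

out = pd.DataFrame(
    records,
    columns=["SN_A", "Age_A", "SN_B", "Age_B", "Matched_Count"],
)
print(out)
```

Note that this assumes the (SN, Age) pairs are unique; with duplicates, df2.loc[a] would return several rows and the comparison would no longer be row against row. The number of pairs also grows quadratically with the number of rows, so this is best suited to small or medium dataframes.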


Answer