Check the result of a chi-square test on pandas column data

I wrote the test following an approach I found. While searching Stack Overflow I came across another approach (linked here) that was somewhat more complicated, which made me wonder whether I chose the right one.
I'm looking for a way to check whether my calculation is correct.

Here is the relevant code:

from scipy.stats import chi2_contingency
import pandas as p
...
 # Example data

data[['Eczema', 'Gender']]

       Eczema  Gender
1     Healthy       0
4     Healthy       1
5     Healthy       0
6     Healthy       1
8     Healthy       1
..        ...     ...
601   Healthy       0
603   Healthy       0
604   Healthy       1
606  Diseased       1
607   Healthy       1

# The contingency table:

p.crosstab(data['Eczema'], data['Gender'])

Gender      0    1
Eczema            
Diseased    5   11
Healthy   219  233

# The calculation:

# Store the p-value under its own name so it does not shadow the pandas alias `p`
chi2, p_value, dof, ex = chi2_contingency(p.crosstab(data['Eczema'], data['Gender']))
p_value
0.27176974714995455

Any suggestions will be welcomed. Thanks!

Answer

The other approach that you linked to is not actually a different method. The code in that question attempted to do the same calculations as those in chi2_contingency, but it had some mistakes.

Your code looks fine. With a p-value of 0.27, well above the conventional 0.05 threshold, the data do not support rejecting the null hypothesis of no association between Eczema and Gender.
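One way to check the result independently is to redo the calculation by hand and compare it with what chi2_contingency returns. The sketch below is only an illustration: it rebuilds the table from the four counts shown in the question instead of the original DataFrame, and the variable names (observed, stat_manual, p_manual and so on) are made up for this example. It assumes the default settings of chi2_contingency, which apply Yates' continuity correction to a 2x2 table.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# Counts copied from the contingency table in the question
# (rows: Diseased, Healthy; columns: Gender 0, Gender 1).
observed = np.array([[5, 11],
                     [219, 233]])

# Expected counts under independence: row total * column total / grand total.
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Yates' continuity correction: shrink each |observed - expected| by 0.5
# (never below zero). chi2_contingency does this by default for 2x2 tables.
corrected_diff = np.maximum(np.abs(observed - expected) - 0.5, 0.0)

# Chi-square statistic and its p-value with (rows - 1) * (cols - 1) degrees of freedom.
stat_manual = (corrected_diff ** 2 / expected).sum()
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = chi2.sf(stat_manual, dof_manual)

# The library result for the same table.
stat_scipy, p_scipy, dof_scipy, expected_scipy = chi2_contingency(observed)

print("manual p-value:", p_manual)
print("scipy p-value: ", p_scipy)
```

If the two printed p-values agree, the library call is doing what you expect. Passing correction=False to chi2_contingency would give the uncorrected Pearson statistic instead, so a hand calculation without the 0.5 adjustment will not match the default output.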
