I wrote the test according to an approach I found. While searching Stack Overflow I saw another approach (can be seen here) which was a little more complicated, and it made me wonder whether I chose the right one.
I’m looking for ways to check if my calculation is correct.
Here is the relevant code:
from scipy.stats import chi2_contingency
import pandas as p
...

# Example data
data[['Eczema', 'Gender']]

       Eczema  Gender
1     Healthy       0
4     Healthy       1
5     Healthy       0
6     Healthy       1
8     Healthy       1
..        ...     ...
601   Healthy       0
603   Healthy       0
604   Healthy       1
606  Diseased       1
607   Healthy       1

# The contingency table:
p.crosstab(data['Eczema'], data['Gender'])

Gender      0    1
Eczema
Diseased    5   11
Healthy   219  233

# The calculation:
chi2, p, dof, ex = chi2_contingency(p.crosstab(data['Eczema'], data['Gender']))
p

0.27176974714995455
Any suggestions will be welcomed. Thanks!
Answer
The other approach that you linked to is not actually a different method. The code in that question attempted to do the same calculations as those in chi2_contingency, but it had some mistakes.
Your code looks fine. With a p-value of 0.27, one would say that the data does not support rejecting the null hypothesis of no association between Eczema and Gender.
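If you want an independent check, one option is to recompute the statistic by hand from the contingency table. The sketch below uses only the standard library and hard-codes the four counts from your crosstab. It assumes the default behaviour of chi2_contingency for a 2x2 table, which applies Yates' continuity correction (correction=True), and it relies on the fact that with one degree of freedom the chi-square survival function reduces to erfc(sqrt(x / 2)). The variable names are my own and are not part of SciPy.

```python
import math

# Observed counts copied from the contingency table in the question
# (rows: Diseased, Healthy; columns: Gender 0, Gender 1).
observed = [[5, 11], [219, 233]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected counts under independence: row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Yates' continuity correction: shrink each |observed - expected| by 0.5.
# (Every difference here is larger than 0.5, so no extra capping is needed.)
chi2_stat = sum(
    (abs(o - e) - 0.5) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# A 2x2 table has (2 - 1) * (2 - 1) = 1 degree of freedom, for which
# the chi-square survival function equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2_stat / 2))

print(chi2_stat, p_value)
```

The p-value printed here should agree with the 0.2717... you got from chi2_contingency. If you pass correction=False to chi2_contingency, the continuity correction is skipped and the p-value will differ from this hand calculation.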