I wrote the test according to an approach I found. While searching Stack Overflow I saw another approach (can be seen here) that was a little more complicated, which made me wonder whether I chose the right one.
I’m looking for ways to check if my calculation is correct.
Here is the relevant code:
from scipy.stats import chi2_contingency
import pandas as pd

# Example data

data[['Eczema', 'Gender']]

       Eczema  Gender
1     Healthy       0
4     Healthy       1
5     Healthy       0
6     Healthy       1
8     Healthy       1
..
601   Healthy       0
603   Healthy       0
604   Healthy       1
606  Diseased       1
607   Healthy       1

# The contingency table:

pd.crosstab(data['Eczema'], data['Gender'])

Gender      0    1
Eczema
Diseased    5   11
Healthy   219  233

# The calculation (pandas is imported as pd so that the p-value
# variable p does not shadow the module alias):

chi2, p, dof, ex = chi2_contingency(pd.crosstab(data['Eczema'], data['Gender']))
p

0.27176974714995455
Any suggestions will be welcomed. Thanks!
Answer
The other approach that you linked to is not actually a different method. The code in that question attempted to do the same calculations as those in chi2_contingency, but it had some mistakes.
Your code looks fine. With a p-value of 0.27, one would say that the data does not support rejecting the null hypothesis of no association between Eczema and Gender.
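If you want to double-check the number yourself, one option is to redo the calculation by hand and compare it with chi2_contingency. The sketch below is only an illustration, not the code from the linked question: it hard-codes the counts from the contingency table in the question, and the variable names (observed, expected, stat_manual, p_manual) are made up for this example. It assumes the default behaviour of chi2_contingency, which applies Yates' continuity correction when the table has one degree of freedom, as a 2x2 table does.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# Counts copied from the contingency table in the question
# (rows: Diseased, Healthy; columns: Gender 0, Gender 1).
observed = np.array([[5, 11],
                     [219, 233]])

# Expected counts under independence: row total * column total / grand total.
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Degrees of freedom: (rows - 1) * (columns - 1), which is 1 for a 2x2 table.
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Chi-square statistic with Yates' continuity correction:
# shrink each |observed - expected| by 0.5 before squaring.
stat_manual = (((np.abs(observed - expected) - 0.5) ** 2) / expected).sum()
p_manual = chi2.sf(stat_manual, dof)

# The same test via SciPy (correction=True is the default).
stat_scipy, p_scipy, dof_scipy, expected_scipy = chi2_contingency(observed)

print("manual:", stat_manual, p_manual)
print("scipy: ", stat_scipy, p_scipy)
```

Both lines should print the same statistic and the same p-value as in the question (about 0.27). Passing correction=False to chi2_contingency drops the continuity correction and gives a smaller p-value for this table, so make sure you compare like with like if you check against another tool.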