I am trying to filter a correlation matrix by p-value for the following matrix:
    import numpy as np
    from scipy.stats import pearsonr

    A = np.array([[ 6.02,  5.32],
                  [12.18, 12.13],
                  [11.08, 10.54],
                  [ 9.03,  8.95],
                  [ 6.08,  6.94]])
I use the following code:
    def get_corr(M, g=1):

        n = np.shape(M)[0]
        out = np.empty(np.shape(M)[0])
        out_p = np.empty(np.shape(M)[0])

        out1 = np.zeros(shape=(np.shape(M)[0], np.shape(M)[0]))
        P1 = np.zeros(shape=(np.shape(M)[0], np.shape(M)[0]))
        for p in range(np.shape(M)[0]):
            for i in range(np.shape(M)[0]):

                PearsonCorrCoeff, pval = pearsonr(M[p,:], M[i,:])
                aux = PearsonCorrCoeff
                out_p[i] = pval
                out[i] = 0 if np.isnan(aux) else aux
                if g == 1:
                    if pval < (0.01):#/N:
                        aux = aux
                    else:
                        aux = 0
                    out[i] = 0 if np.isnan(aux) else aux
                else:
                    out[i] = 0 if np.isnan(aux) else aux
            out1[p] = out
            P1[p] = out_p
        return out1, P1

    corr_A, P_A = get_corr(A)
But the result I get is strange, because the correlation matrix without filtering is
    corr_A=array([[ 1., -1.,  1., -1.,  1.],
                  [-1.,  1., -1.,  1., -1.],
                  [ 1., -1.,  1., -1.,  1.],
                  [-1.,  1., -1.,  1., -1.],
                  [ 1., -1.,  1., -1.,  1.]])
and the P-value matrix is
    P_A=array([[1., 1., 1., 1., 1.],
               [1., 1., 1., 1., 1.],
               [1., 1., 1., 1., 1.],
               [1., 1., 1., 1., 1.],
               [1., 1., 1., 1., 1.]])
while they should all be zero. I do not know what the reason could be. Has anyone had the same problem before?
Answer
To elaborate on @Marat’s comment, you likely want:
    pearsonr(M[:,p], M[:,i])
Why is -1/1 what you’d expect here? Each row of your matrix holds only two values, so the row-wise call correlates two-element vectors. Think about the case where x and y are just two values apiece, and about fitting a best-fit line through a graph of these values. Something like:
    import numpy as np
    import matplotlib.pyplot as plt

    # Two variables with only two values each
    A = np.random.randn(2,2)

    x = A[0]
    y = A[1]

    # A straight line through two points always fits exactly
    ax = plt.plot(x,y, "-o")
    ax[0].axes.set(xlabel="x", ylabel="y")
    plt.show()
So not too shabby! A line through two points always fits perfectly.
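The same point can be checked numerically. The snippet below is a minimal sketch that reuses the matrix from the question and passes two of its rows (two values each) to `pearsonr`; the choice of rows 0 and 4 is arbitrary and only for illustration.

```python
import numpy as np
from scipy.stats import pearsonr

A = np.array([[ 6.02,  5.32],
              [12.18, 12.13],
              [11.08, 10.54],
              [ 9.03,  8.95],
              [ 6.08,  6.94]])

# Each row holds only two values, so this correlates two-element vectors.
# With two points the coefficient can only be +1 or -1, and the p-value
# carries no evidence against "no correlation".
r, pval = pearsonr(A[0, :], A[4, :])
print(r, pval)
```

With only two observations per vector, the coefficient just reflects whether both vectors move in the same direction, which is why a p-value filter never lets anything through.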
You’re probably expecting something like this:
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import pearsonr

    # Two independent variables with 300 values each
    B = np.random.randn(2,300)

    x = B[0]
    y = B[1]

    print(pearsonr(x,y))

    ax = plt.plot(x,y, "o")
    ax[0].axes.set(xlabel="x", ylabel="y", title="With >two values")
    plt.show()
As expected, not much of a correlation.
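For completeness, here is a rough sketch of how the filtering function from the question might look once it correlates columns instead of rows. The name `get_corr_columns` and the `alpha` parameter are made up for this illustration. Note that the loops have to run over `M.shape[1]` (the number of columns), otherwise `M[:, p]` would go out of bounds for a 5x2 matrix.

```python
import numpy as np
from scipy.stats import pearsonr

def get_corr_columns(M, alpha=0.01):
    """Hypothetical helper: column-wise Pearson correlation, zeroed where p >= alpha."""
    n_vars = M.shape[1]                    # one variable per column
    corr = np.zeros((n_vars, n_vars))      # filtered coefficients
    pvals = np.ones((n_vars, n_vars))      # raw p-values
    for p in range(n_vars):
        for i in range(n_vars):
            r, pval = pearsonr(M[:, p], M[:, i])
            pvals[p, i] = pval
            # Keep the coefficient only if it is defined and significant
            if not np.isnan(r) and pval < alpha:
                corr[p, i] = r
    return corr, pvals

A = np.array([[ 6.02,  5.32],
              [12.18, 12.13],
              [11.08, 10.54],
              [ 9.03,  8.95],
              [ 6.08,  6.94]])

corr_A, P_A = get_corr_columns(A)
print(corr_A)
print(P_A)
```

For this matrix the result is 2x2, because there are two columns with five observations each. If the rows are really the variables of interest, passing `A.T` would give a 5x5 result, but with only two observations per variable it would again contain nothing but ±1 and p-values of 1.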