
Calculate squared deviation from the mean for each element in array

I have an array with shape (128,116,116,1), where the 1st dimension is the number of subjects and the 2nd and 3rd dimensions are the data.

I was trying to calculate the variance (the mean squared deviation from the mean) at each position (i.e. at (0,0), (0,1), (1,0), etc., up to (115,115)) across all 128 subjects, resulting in an array with shape (116,116).

Can anyone tell me how to accomplish this?

Thank you!


Answer

Let’s say we have a nested list a of shape (3,2,2):

import numpy as np

# Three "subjects", each holding a 2x2 block of data
a = [
    [
        [1, 1],
        [1, 1]
    ],
    [
        [2, 2],
        [2, 2]
    ],
    [
        [3, 3],
        [3, 3]
    ],
]

# Variance across the first axis (the subjects)
np.var(a, axis=0)  # results in:
# array([[0.66666667, 0.66666667],
#        [0.66666667, 0.66666667]])

The statistics package does not accept nested lists as input, so I don’t see a way to compute the variance across all 128 subjects (which would be axis 0) with it directly. You would have to write your own logic and loop over every position yourself.
With numpy.var, however, we can calculate the variance at each position (each pair of indices) across all 128 subjects in a single call by passing axis=0. Note that for an input of shape (128,116,116,1) the result has shape (116,116,1), so the trailing dimension of size 1 still has to be dropped to get (116,116).
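Applied to the shape from the question, a minimal sketch might look like the following. The random data and the names `data`, `variance_map` and `manual` are placeholders for illustration and are not from the original post.

```python
import numpy as np

# Placeholder data standing in for the real array (assumption: random values).
rng = np.random.default_rng(0)
data = rng.normal(size=(128, 116, 116, 1))

# Variance across subjects (axis 0); the trailing axis of size 1 is kept.
variance_with_channel = np.var(data, axis=0)            # shape (116, 116, 1)

# Drop the trailing singleton axis to get one value per position.
variance_map = variance_with_channel.squeeze(axis=-1)   # shape (116, 116)

# The same quantity written out by hand:
# the mean of the squared deviations from the per-position mean.
manual = ((data - data.mean(axis=0)) ** 2).mean(axis=0).squeeze(axis=-1)
```

Here `variance_map[i, j]` holds the variance of position (i, j) over the 128 subjects, and `manual` should agree with it up to floating-point rounding.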


Side note: You mentioned statistics.variance. However, that function computes the sample variance and is only meant to be used when your data is a sample from a larger population, as is mentioned in the documentation you linked. If you were to go the manual route, you would use statistics.pvariance instead, since we are calculating the variance on the whole dataset. The difference can be seen here:

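import statistics

statistics.pvariance([1, 2, 3])
# 0.6666666666666666  (population variance, what we want here)
statistics.variance([1, 2, 3])
# 1  (sample variance, divides by N - 1, so not what we want here)
np.var([1, 2, 3])
# 0.6666666666666666  (np.var computes the population variance by default)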
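If the sample variance is what you need after all, np.var accepts a `ddof` argument that changes the divisor from N to N - ddof. The sketch below shows that, together with one possible version of the manual route using statistics.pvariance on a small nested list; the variable names are illustrative only.

```python
import statistics
import numpy as np

values = [1, 2, 3]

population_var = np.var(values)        # divides by N, like statistics.pvariance
sample_var = np.var(values, ddof=1)    # divides by N - 1, like statistics.variance

# Manual route on a nested list of shape (3, 2, 2):
# collect the values at position (i, j) over all subjects, then take pvariance.
a = [[[1, 1], [1, 1]], [[2, 2], [2, 2]], [[3, 3], [3, 3]]]
rows, cols = len(a[0]), len(a[0][0])
manual = [
    [statistics.pvariance([subject[i][j] for subject in a]) for j in range(cols)]
    for i in range(rows)
]
```

The nested loops give the same numbers as `np.var(a, axis=0)`, but they run in pure Python and will be much slower on a 128 x 116 x 116 array.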
User contributions licensed under: CC BY-SA