I am learning Python and have come across numpy.sum. It takes an optional axis parameter, which selects whether the sum is taken column-wise or row-wise. With axis=0, the result is one sum per column. For example:
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
np.sum(a, axis=0)
This snippet outputs array([5, 7, 9]), as expected. But if I do:
a = np.array([1, 2, 3])
np.sum(a, axis=0)
The result is 6. Why is that? Shouldn’t I get array([1, 2, 3])?
Answer
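All that is going on is that numpy is summing across the first (0th) and only axis. Summing over an axis removes that axis from the result, so a 1-D array collapses to a single number. Consider the following: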
In [2]: a = np.array([1, 2, 3])
In [3]: a.shape
Out[3]: (3,)
In [4]: len(a.shape) # number of dimensions
Out[4]: 1
In [5]: a1 = a.reshape(3,1)
In [6]: a2 = a.reshape(1,3)
In [7]: a1
Out[7]:
array([[1],
       [2],
       [3]])
In [8]: a2
Out[8]: array([[1, 2, 3]])
In [9]: a1.sum(axis=1)
Out[9]: array([1, 2, 3])
In [10]: a1.sum(axis=0)
Out[10]: array([6])
In [11]: a2.sum(axis=1)
Out[11]: array([6])
In [12]: a2.sum(axis=0)
Out[12]: array([1, 2, 3])
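The pattern in the outputs above can be stated as a rule about shapes: the result has the input's shape with the summed axis removed. The sketch below checks that rule against the same arrays. The helper shape_after_sum is a hypothetical name introduced here for illustration; it is not part of numpy.

```python
import numpy as np

def shape_after_sum(shape, axis):
    """Hypothetical helper: the input shape with the summed axis dropped."""
    return tuple(n for i, n in enumerate(shape) if i != axis)

a = np.array([1, 2, 3])   # shape (3,)
a1 = a.reshape(3, 1)      # shape (3, 1)
a2 = a.reshape(1, 3)      # shape (1, 3)

# The rule holds for both 2-D arrays and both axes.
for arr in (a1, a2):
    for axis in (0, 1):
        assert arr.sum(axis=axis).shape == shape_after_sum(arr.shape, axis)

# The 1-D case: dropping the only axis leaves the empty shape (), a scalar.
total = a.sum(axis=0)
assert total.shape == shape_after_sum(a.shape, 0) == ()
```

Under this rule there is no axis left over for the 1-D array, which is why the question's second example returns a single number rather than an array.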
So, to be more explicit:
In [15]: a1.shape
Out[15]: (3, 1)
a1 is 2-dimensional, the “long” axis being the first.
In [16]: a1[:,0] # give me everything in the first axis, and the first part of the second
Out[16]: array([1, 2, 3])
Now, sum along the first axis:
In [17]: a1.sum(axis=0)
Out[17]: array([6])
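Relating this back to the question: if the intent was to treat the 1-D array as a single row, so that axis=0 returns array([1, 2, 3]), one option is to give it a second axis before summing. This is a minimal sketch using np.atleast_2d and np.newaxis; the names row and col are chosen here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])      # shape (3,)

row = np.atleast_2d(a)       # shape (1, 3): one row, three columns
col = a[:, np.newaxis]       # shape (3, 1): three rows, one column

print(row.sum(axis=0))       # [1 2 3]
print(col.sum(axis=0))       # [6]
```

Which of the two is appropriate depends on whether the data is meant to be one row or one column; numpy does not assume either for a 1-D array.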
Now, consider a less trivial two-dimensional case:
In [20]: b = np.array([[1,2,3],[4,5,6]])
In [21]: b
Out[21]:
array([[1, 2, 3],
       [4, 5, 6]])
In [22]: b.shape
Out[22]: (2, 3)
The first axis (axis 0) indexes the rows. Summing along it adds the rows together, giving one total per column:
In [23]: b.sum(axis=0)
Out[23]: array([5, 7, 9])
The second axis (axis 1) indexes the columns. Summing along it adds the entries within each row, giving one total per row:
In [24]: b.sum(axis=1)
Out[24]: array([ 6, 15])
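To make the two results concrete, each sum can be written out by hand as the addition of the slices taken along the chosen axis. This is only an illustrative sketch of what the reduction computes, not how numpy implements it; by_axis0 and by_axis1 are names introduced here.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)

# axis=0: add the slices taken along the first axis (the two rows).
by_axis0 = b[0, :] + b[1, :]
assert np.array_equal(by_axis0, b.sum(axis=0))

# axis=1: add the slices taken along the second axis (the three columns).
by_axis1 = b[:, 0] + b[:, 1] + b[:, 2]
assert np.array_equal(by_axis1, b.sum(axis=1))

print(by_axis0)   # [5 7 9]
print(by_axis1)   # [ 6 15]
```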