What does axis = 0 do in Numpy’s sum function?

I am learning Python and have come across numpy.sum. It has an optional parameter, axis, which selects column-wise or row-wise summation. As I understand it, axis = 0 means the sum is taken down each column. For example:

import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
np.sum(a, axis = 0)

This snippet produces array([5, 7, 9]), as expected. But if I do:

a = np.array([1, 2, 3])
np.sum(a, axis = 0)

The result is 6. Why is that? Shouldn’t I get array([1, 2, 3])?

Answer

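All that is going on is that numpy is summing along the first (0th) and only axis of the array. Summing along an axis removes that axis from the result, so a 1-dimensional array collapses to a single number. Consider the following: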

In [2]: a = np.array([1, 2, 3])

In [3]: a.shape
Out[3]: (3,)

In [4]: len(a.shape) # number of dimensions
Out[4]: 1

In [5]: a1 = a.reshape(3,1)

In [6]: a2 = a.reshape(1,3)

In [7]: a1
Out[7]: 
array([[1],
       [2],
       [3]])

In [8]: a2
Out[8]: array([[1, 2, 3]])

In [9]: a1.sum(axis=1)
Out[9]: array([1, 2, 3])

In [10]: a1.sum(axis=0)
Out[10]: array([6])

In [11]: a2.sum(axis=1)
Out[11]: array([6])

In [12]: a2.sum(axis=0)
Out[12]: array([1, 2, 3])
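
One way to read the session above is as a shape rule: the axis you sum along is dropped from the shape, and the remaining axes are kept. The sketch below checks that reading against the same three arrays. The helper name shape_after_sum is made up for illustration and is not part of numpy.

```python
import numpy as np

def shape_after_sum(shape, axis):
    """Hypothetical helper: predict the result shape by dropping the summed axis."""
    return tuple(n for i, n in enumerate(shape) if i != axis)

a = np.array([1, 2, 3])

# Compare the prediction with what numpy actually returns
# for every axis of the 1-D array and of both reshaped versions.
for arr in (a, a.reshape(3, 1), a.reshape(1, 3)):
    for axis in range(arr.ndim):
        result = np.sum(arr, axis=axis)
        predicted = shape_after_sum(arr.shape, axis)
        print(arr.shape, "axis =", axis, "->", np.shape(result), "predicted", predicted)
```

For the 1-dimensional array the predicted shape is the empty tuple, which is why the result is a plain number rather than an array.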

So, to be more explicit:

In [15]: a1.shape
Out[15]: (3, 1)

a1 is 2-dimensional, the “long” axis being the first.

In [16]: a1[:,0] # give me everything in the first axis, and the first part of the second
Out[16]: array([1, 2, 3])

Now, sum along the first axis:

In [17]: a1.sum(axis=0)
Out[17]: array([6])
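
Applied to the original 1-dimensional array, the same reasoning gives the 6 from the question. The sketch below also shows two ways to get a different shape back, assuming that is what was wanted: add a leading axis of length 1 before summing, or pass keepdims=True to keep the summed axis with length 1. The variable names are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])

# The only axis is summed away, leaving a scalar.
total = np.sum(a, axis=0)

# With a leading axis of length 1, shape (1, 3), summing along axis 0
# adds up a single row, so the values come back unchanged.
row = a[np.newaxis, :]
unchanged = np.sum(row, axis=0)

# keepdims=True keeps the summed axis with length 1 instead of dropping it.
kept = np.sum(a, axis=0, keepdims=True)

print(total, unchanged, kept)
```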

Now, consider a less trivial two-dimensional case:

In [20]: b = np.array([[1,2,3],[4,5,6]])

In [21]: b
Out[21]: 
array([[1, 2, 3],
       [4, 5, 6]])

In [22]: b.shape
Out[22]: (2, 3)

The first axis (axis 0) indexes the “rows”. Summing along it adds the rows together element by element, giving one total per column:

In [23]: b.sum(axis=0)
Out[23]: array([5, 7, 9])

The second axis (axis 1) indexes the “columns”. Summing along it adds up the entries within each row, giving one total per row:

In [24]: b.sum(axis=1)
Out[24]: array([ 6, 15])
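
To make the two results concrete, here is a minimal sketch that rebuilds both sums by hand for the same array b and compares them with what numpy returns. The names sum_axis0 and sum_axis1 are only illustrative.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

# axis=0: step along the rows and add them together element by element.
sum_axis0 = b[0] + b[1]

# axis=1: step along the columns within each row and total that row.
sum_axis1 = np.array([row.sum() for row in b])

print(np.array_equal(sum_axis0, b.sum(axis=0)))
print(np.array_equal(sum_axis1, b.sum(axis=1)))
```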