
Numpy multiplication using * (asterisk) returning wrong values when using named variables

I am running into a problem using the operator * with numpy scalars, and it would be great if someone can explain what is going on.

Basically, I needed to multiply the sums of columns and rows from various dataframes, and the easiest way to do that was to assign each aggregate to a variable, and then multiply those variables together.

The following block of code demonstrates the problem:


Then, I multiply the resulting sums using both hardcoded and variable approaches and receive two different results:


I’m guessing this has something to do with how numpy treats the * operator, but I have not been able to find a definitive explanation about what is going on and how to avoid this problem.

Note that the following workaround that removes numpy from the question returns 1 as expected:


Thanks in advance!


Answer

The problem is that you are using fixed-width integers (int64), which can only hold values between a fixed minimum and maximum, and your product is larger than that maximum, so it wraps around (integer overflow). You could either use variable-size integers (like the arbitrary-precision int that Python uses) or switch to floats, which trade some precision for a much larger representable range.
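As a minimal sketch of the overflow, assuming two made-up sums (`a_sum` and `b_sum` are hypothetical stand-ins, not the values from the question), you can compare the int64 product with the exact product computed using Python's built-in int:

```python
import numpy as np

# Hypothetical stand-ins for the aggregated sums (not the question's real data).
a_sum = np.int64(3_000_000_000)
b_sum = np.int64(4_000_000_000)

# Largest value an int64 can hold: 9223372036854775807.
int64_max = np.iinfo(np.int64).max

# Python's built-in int has arbitrary precision, so this product is exact.
exact = int(a_sum) * int(b_sum)

# The NumPy scalar product stays in int64 and wraps around on overflow.
# errstate only silences the RuntimeWarning; it does not change the result.
with np.errstate(over="ignore"):
    wrapped = a_sum * b_sum

print(exact > int64_max)   # the true product does not fit in int64
print(exact, wrapped)      # the two results differ
```

The wrapped value is the exact product reduced modulo 2**64, which is why it can look like an unrelated (even negative) number.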

Practically, you can just force the _sum variables to be treated as float before overflowing:


With this change, the following:


will print a value of 1.0.
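A rough sketch of the float conversion, again with hypothetical values for `a_sum` and `b_sum` rather than the ones from the question, could look like this:

```python
import numpy as np

# Hypothetical stand-ins for the aggregated sums (not the question's real data).
a_sum = np.int64(3_000_000_000)
b_sum = np.int64(4_000_000_000)

# Convert to float64 *before* multiplying, so the product is computed
# in floating point instead of wrapping around in int64.
a_f = np.float64(a_sum)
b_f = np.float64(b_sum)

product = a_f * b_f
print(product)  # about 1.2e19, well above the int64 maximum
```

The conversion has to happen before the multiplication; converting the already overflowed int64 result to float would only preserve the wrong value.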

Note that this apparently exact result is specific to this calculation and to how these particular numbers are converted.

In general, results obtained with float arithmetic and big int arithmetic will be different, e.g.:
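One way to see the difference, using an illustrative value chosen here (`big` is not from the question), is to pick an integer just above 2**53, the point where a 64-bit float can no longer represent every integer:

```python
# Illustrative value: the first integer that a 64-bit float cannot represent.
big = 2**53 + 1

# Converting to float rounds to the nearest representable value, 2**53.
as_float = float(big)
print(int(as_float) == big)   # False: the trailing 1 was lost

# Big int arithmetic is exact; float arithmetic works on the rounded inputs.
exact = big * big
approx = as_float * as_float
print(exact - int(approx))    # the float result is off by 2**54 + 1
```

For values that fit in 53 bits the two approaches agree, which is why small test cases can hide the discrepancy.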

User contributions licensed under: CC BY-SA