I am using this code to compute all possible sum of a n x m
matrix. The code is working absolutely fine and it is fast too when using arrays of 32-bit integers like [[777,675,888],[768,777,698]]
. It is using the Numpy package. However, as soon as I use 128-bit integers or bigger, I start getting negative values.
Its working fine with integers like 888888888888888
. All I need to do is to set the datatype as np.int64
, but for larger values none of the data types are working.
I am using Numpy because of speed. And another reason is that I am not that perfect in programming. Any solution to this? Is it possible to define custom datatype in Numpy or using python default datatype in this code?
Here is the code:
import numpy as np data = np.array([[1999999999999999998,2999999999999999998,3999999999999999998],[4999999999999999998,5999999999999999998,6999999999999999998]]) print(data) div = int(data.shape[0]) row_len_squared = int(data.shape[1]**2) firstPossibleSumsArray = np.empty(int((div * (div - 1)) / 2 * row_len_squared), dtype=int) idx = 0 for row in range(div): for col in range(row + 1, div): firstPossibleSumsArray[idx:idx+row_len_squared] = (data[row,:,np.newaxis] + data[col]).flatten() idx += row_len_squared #reapeat process for second possible sums array by replacing the range #in the first loop from range(div) to range(div,2*div) print(firstPossibleSumsArray)
Advertisement
Answer
The source of the negative value is coming from integer overflows. If you want to prevent overflows, you should use sufficiently big integers. Beyond 64 bits, Numpy only support unbounded Python integers (which are not very fast). You can enable this with dtype=object
.
Here the corrected code:
import numpy as np data = np.array([[1999999999999999998,2999999999999999998,3999999999999999998],[4999999999999999998,5999999999999999998,6999999999999999998]], dtype=object) print(data) div = int(data.shape[0]) row_len_squared = int(data.shape[1]**2) firstPossibleSumsArray = np.empty(int((div * (div - 1)) / 2 * row_len_squared), dtype=object) idx = 0 for row in range(div): for col in range(row + 1, div): firstPossibleSumsArray[idx:idx+row_len_squared] = (data[row,:,np.newaxis] + data[col]).flatten() idx += row_len_squared #reapeat process for second possible sums array by replacing the range #in the first loop from range(div) to range(div,2*div) print(firstPossibleSumsArray)
Output:
[[1999999999999999998 2999999999999999998 3999999999999999998] [4999999999999999998 5999999999999999998 6999999999999999998]] [6999999999999999996 7999999999999999996 8999999999999999996 7999999999999999996 8999999999999999996 9999999999999999996 8999999999999999996 9999999999999999996 10999999999999999996]