Efficiently accumulating a collection of sparse scipy matrices

Question

I've got a collection of O(N) NxN scipy.sparse.csr_matrix, and each sparse matrix has on the order of N elements set. I want to add all these matrices together to get a regular NxN numpy array. (N is on the order of 1000). The arrangement of non-zero elements within the matrices is such that the resulti…

Accepted Answer

I think I've found a way to speed it up by a factor of ~10 if your matrices are very sparse.

    In [1]: import numpy as np
       ...: from scipy.sparse import csr_matrix

    In [2]: def sum_sparse(m):
       ...:     x = np.zeros(m[0].shape)
       ...:     for a in m:
       ...:         # one row index per stored entry, derived from the CSR row pointer
       ...:         ri = np.repeat(np.arange(a.shape[0]), np.diff(a.indptr))
       ...:         x[ri, a.indices] += a.data
       ...:     return x
       ...:

    In [6]: m = [np.zeros((100, 100)) for i in range(1000)]

    In [7]: for x in m:
       ...:     x.ravel()[np.random.randint(0, x.size, 10)] = 1.0
       ...:

    In [8]: m = [csr_matrix(x) for x in m]

    In [17]: (sum(m[1:], m[0]).todense() == sum_sparse(m)).all()
    Out[17]: True

    In [18]: %timeit sum(m[1:], m[0]).todense()
    10 loops, best of 3: 145 ms per loop

    In [19]: %timeit sum_sparse(m)
    100 loops, best of 3: 18.5 ms per loop
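For readers who want to run this outside IPython, here is a minimal standalone sketch of the same idea. The comments spell out why expanding `indptr` with `np.diff` and `np.repeat` gives one row index per stored entry. It makes one substitution that is not in the answer above: `np.add.at` replaces the fancy-indexed `+=`, on the assumption that an input might hold duplicate (row, column) entries, which `+=` would count only once. `np.add.at` is typically slower, so the original `+=` is likely the better choice when the matrices are known to be in canonical form. The matrix sizes, the seed, and the variable names are illustrative only.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sum_sparse(matrices):
    """Accumulate CSR matrices of equal shape into one dense array."""
    total = np.zeros(matrices[0].shape)
    for a in matrices:
        # indptr[i + 1] - indptr[i] is the number of stored entries in row i,
        # so repeating each row number that many times yields one row index
        # per stored entry, aligned with a.indices and a.data.
        rows = np.repeat(np.arange(a.shape[0]), np.diff(a.indptr))
        # Assumption: inputs may contain duplicate (row, col) entries.
        # np.add.at accumulates them, whereas `total[rows, cols] += data`
        # would apply only one of the duplicates.
        np.add.at(total, (rows, a.indices), a.data)
    return total


rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability
mats = []
for _ in range(50):
    dense = np.zeros((20, 20))
    dense.ravel()[rng.integers(0, dense.size, 5)] = 1.0
    mats.append(csr_matrix(dense))

# Reference result: add the sparse matrices pairwise, then densify.
reference = sum(mats[1:], mats[0]).toarray()
result = sum_sparse(mats)
print(np.array_equal(result, reference))
```

The comparison against the pairwise sparse sum mirrors the check in the session above. It confirms the two approaches agree on this input and says nothing about relative speed.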

Answer