Compute sum of power of large sparse matrix

Question

Given a query vector (one-hot-vector) q with size of 50000&#215;1 and a large sparse matrix A with size of 50000 x 50000 and nnz of A is 0.3 billion, I want to compute r=(A + A^2 + &#8230; + A^S)q (usually 4 <= S <=6). I can above equation iteratively using loop but I want to more fast method. First

Accepted Answer

Even if the matrix is not sparse this the iterative method is the way to go.Multiplying A.dot(q) has complexity O(N^2), while computing A.dot(A^i) has complexity O(N^3).The fact that q is sparse (indeed much more sparse than A) may help.For the first iteration A*q can be computed as A[q_hot_index,:].T.For the second iteration the expected density of A @ q has the same expected density as A for A (about 10%) so it is still good to do it sparsely.For the third iteration afterwards the A^i @ q will be dense.Since you are accumulating the result it is good that your r is not sparse, it prevents index manipulation.There are several different ways to store sparse matrices. I myself can&#8217;t say I understand in depth all of them, but I think csr_matrix, csc_matrix, are the most compact for generic sparse matrices.Eigen decomposition is good when you need to compute a P(A), to compute a P(A)*q the eigen decomposition becomes advantageous only when P(A) has degree of the order of the size of A. Eigen-decomposition has complexity O(N^3), a matrix-vector product has complexity O(N^2), the evaluation of a polynomial P(A) of degree D using the eigen decomposition can be achieved in O(N^3 + N*D).Edit: Answering questionss on the comments&#8220;it prevents index manipulation&#8221; <- Could you elaborate this?Suppose you have a sparse matrix [0,0,0,2,0,7,0]. This could be described as ((3,2), (5,7)). Now suppose you assigne 1 to one element and it becomes [0,0,0,2,1,7,0], it is now represented as ((3,2), (4,1), (5,7)). The assignment is performed by insertion in an array, and inserting in an array has complexity O(nnz), where nnz is the number of nonzero elements. If you have a dense matrix you can always modify one element with complexity O(1).What is the N in the complexity?It is the number of rows, or columns, of the matrix AAbout the eigen decomposition, do you want to say that it is worththat computing r can be achieved in O(N^3 +N*D) not O(N^3 + N^2)Computing P(A) will have complexity O(N^3 * D) (with a different constant), for big matrices, computing P(A) using the eigen decomposition is probably the most efficient. But P(A)x have O(N^2 * D) complexity, so it is probably not a good idea to compute P(A)x with eigen decomposition unless you have big D (>N), when speed is concerned.

Advertisement

Answer