This is a general question. Suppose we have a sparse CSR matrix A and a dense column-major matrix B, and we want to calculate the product AB. In principle, one can parallelize either over the rows of A or over the columns of B. Given the dimensions of the matrices and maybe some additional information (the sparsity of A, for example), which way is more efficient? Does anyone know how it is done in MKL, for example?
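
To make the two options concrete, here is a minimal sketch of both decompositions in plain Python. It only illustrates how the work is split and says nothing about how MKL does it internally. The function names `spmm_by_rows` and `spmm_by_cols`, the flat-list storage, and the small example matrices are all my own assumptions for illustration. CPython threads will not give a real speedup here because of the GIL; the point is only which loop becomes the parallel one.

```python
from concurrent.futures import ThreadPoolExecutor


def spmm_by_rows(indptr, indices, data, B, k, n, workers=4):
    """C = A @ B with one task per row of A (hypothetical helper).

    A is m x k in CSR form (indptr, indices, data).
    B is k x n, column-major in a flat list: B[r + c * k].
    C is m x n, column-major in a flat list: C[i + j * m].
    """
    m = len(indptr) - 1
    C = [0.0] * (m * n)

    def row_task(i):
        # Each nonzero of row i is read once and reused for all n columns.
        for p in range(indptr[i], indptr[i + 1]):
            a, r = data[p], indices[p]
            for j in range(n):
                C[i + j * m] += a * B[r + j * k]

    # Tasks write to disjoint entries of C (row i only), so no locking is needed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(row_task, range(m)))
    return C


def spmm_by_cols(indptr, indices, data, B, k, n, workers=4):
    """C = A @ B with one task per column of B (hypothetical helper).

    Same storage conventions as spmm_by_rows.
    """
    m = len(indptr) - 1
    C = [0.0] * (m * n)

    def col_task(j):
        # One sparse matrix-vector product; all of A is traversed per column.
        for i in range(m):
            s = 0.0
            for p in range(indptr[i], indptr[i + 1]):
                s += data[p] * B[indices[p] + j * k]
            C[i + j * m] = s

    # Tasks write to disjoint entries of C (column j only).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(col_task, range(n)))
    return C


# Small made-up example: A = [[1, 0, 2], [0, 3, 0]], B has columns (1, 2, 3) and (4, 5, 6).
indptr, indices, data = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
B = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
C_rows = spmm_by_rows(indptr, indices, data, B, k=3, n=2)
C_cols = spmm_by_cols(indptr, indices, data, B, k=3, n=2)
```

In this sketch the row version offers m independent tasks whose cost depends on the number of nonzeros in each row, while the column version offers n equally sized tasks that each walk through all of A. That difference in task count, load balance, and memory traffic is what I am asking about.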