I have this sparse matrix operation `R*kron(I,C)*R'`

where `R`

is a sparse matrix, `I`

is the identity matrix, and `C`

is a dense square matrix. The size of these matrices can be quite large; `R`

is `2000 x 6250000`

and the kronecker product is `6250000 x 6250000`

but fortunately only the `C 2500 x 2500`

matrix needs to be stored. The product will only be a `2000 x 2000`

square matrix.

Because of the size of these matrices it is very slow to calculate the product not to mention it a naive implementation would use a ton of memory. I managed to take advantage of the structure of the kronecker product to turn the product into the sum of the products of each diagonal block i.e. `sum.(R*kron(Diagonal(circshift([1,0,...],i)),C)*R' for i=1:nblocks)`

. This is faster, parallelizable, and memory efficient but it is still prohibitively expensive to calculate.

For my use case I have to calculate that product many times. However, only the values of the `C`

matrix are changing, while the rest of the matrices are constant. Since `R`

is sparse each index of the final product should only consist of relatively few elements of `C`

. In theory it should be faster to calculate the product symbolically once and reuse it. My first impulse was to use SymEngine to make a `C`

matrix filled with symbols but its taking too long to calculate.

Does anyone have any alternative approach that will speed up this calculation?