Hi folks, I observed something strange recently: passing a distributed matrix to a function seems to hurt its performance. Let's say the task is to apply a distributed matrix to some vectors, as below:

using Distributed
using DistributedArrays
using LinearAlgebra
% distributed matvec multiplicatio…

Why passing a distributed matrix to functions hurts performance?

stillyslalom April 11, 2022, 6:52pm 2

For distributed matmul, you should set the number of BLAS threads in each worker process to 1 with BLAS.set_num_threads(1). Otherwise every worker starts its own pool of BLAS threads and together they oversubscribe the available cores.
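A minimal sketch of that setting, in case it helps. The worker count of 2 and the variable name `nthreads` are assumptions for illustration and are not taken from the original post. It pins BLAS to one thread on every process and then reads the value back from each worker:

```julia
using Distributed

# Assumption: two local worker processes; adjust to your machine.
addprocs(2)

# Load LinearAlgebra on the master and on every worker.
@everywhere using LinearAlgebra

# Limit BLAS to a single thread in each process so that the workers
# do not compete with each other for the same cores.
@everywhere BLAS.set_num_threads(1)

# Read the setting back from each worker to confirm it took effect.
nthreads = [remotecall_fetch(BLAS.get_num_threads, w) for w in workers()]
println(nthreads)
```

Note that `BLAS.get_num_threads` requires Julia 1.6 or newer. The call has to run under `@everywhere`, because a plain `BLAS.set_num_threads(1)` only affects the process that executes it.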

Efficient ways to implement a (distributed) matrix-matrix product?
