How to avoid repeated data movement between processes?

Note that this only ever affected the “generic” matmul routine used as a fallback for user-defined number and array types. If you are using ordinary sparse or dense arrays of 32-bit or 64-bit real or complex floating-point values, thread safety was fine even before my commit.
