[ANN]: PaddedMatrices.jl, Julia BLAS and partially sized arrays

@Elrod, you are amazing!
I remember when we talked about SIMD and the trick of padding matrices so that each row has a number of elements that is a multiple of the SIMD width, which led to PaddedMatrices. You made miracles with that trick. Bravo!

Could you tell at what matrix size OpenBLAS / MKL start using multithreading?
I also think it would be nice to compare to MKL’s JIT feature for small matrices.

With Julia's syntactic sugar you can broadcast matrix multiplication in a style like Packed GEMM and Batch GEMM.
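To illustrate what I mean, here is a minimal sketch using only Base Julia. It is not the PaddedMatrices.jl API or MKL's batch interface; the names `As`, `Bs`, `B_shared` and the sizes are made up for the example. Broadcasting `*` over vectors of matrices multiplies each pair, and wrapping a matrix in `Ref` reuses it for every product:

```julia
# Batch of 10 independent products; sizes are arbitrary for illustration.
As = [rand(4, 3) for _ in 1:10]
Bs = [rand(3, 5) for _ in 1:10]

# Broadcasting `*` over the two vectors: each element is a matrix product.
Cs = As .* Bs

# Same left-hand batch against one shared right-hand matrix.
# `Ref` makes broadcast treat the matrix as a scalar.
B_shared = rand(3, 5)
Ds = As .* Ref(B_shared)
```

A package could presumably specialize this kind of broadcast to pack the shared operand once, which is the idea behind Packed GEMM.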

By the way, I think Images could benefit a lot from this package.
Padding is widely used in this area, and it takes care of both aligned loads and the tail of each row.
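A rough sketch of the padding idea in plain Julia, not how PaddedMatrices.jl actually does it. The helper `padded_length` is hypothetical, and the SIMD width `W = 8` is an assumption (8 `Float64` values per 512-bit register). Since Julia arrays are column-major, the sketch pads the first dimension, so every column starts at a multiple of `W` elements:

```julia
# Hypothetical helper: round n up to the next multiple of the SIMD width W.
padded_length(n::Integer, W::Integer) = cld(n, W) * W

W = 8                            # assumed SIMD width in elements
n, m = 13, 5                     # logical image size, arbitrary for illustration
stride = padded_length(n, W)     # 16

# Allocate the padded buffer and expose only the logical part as a view.
buf = zeros(Float64, stride, m)
A = view(buf, 1:n, :)
A .= rand(n, m)                  # writes touch only the logical elements

# A kernel can loop over full vectors of W elements per column without a
# scalar tail loop, because the padding rows are valid (zero) memory.
```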
