Working with MulAddMul

I’m trying to implement the 5-argument mul!, introduced in Julia v1.3, for StaticArrays, and for SizedArray in particular. I want to use LinearAlgebra.MulAddMul, which is used throughout the matrix multiplication methods in the LinearAlgebra standard library. However, whenever I use it I end up with type instability (which is expected) and with heap allocations. I haven’t been able to figure out how to use it without allocating, even though my approach seems very similar to the standard library methods, which use MulAddMul without allocating at all.
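
For reference, this is the operation I’m targeting, shown with plain Matrix arguments. As I understand the documented behavior, the 5-argument form computes C = A*B*α + C*β in place. The values below are just illustrative.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
B = [1.0 0.0; 0.0 1.0]   # identity, so A*B == A
C = ones(2, 2)

# In-place update: C = A*B*2.0 + C*3.0
mul!(C, A, B, 2.0, 3.0)
```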

Here’s a simple example.


using LinearAlgebra, StaticArrays, BenchmarkTools

N1,N2 = 50,55
A = SizedMatrix{N1,N2}(rand(N1,N2))
B = SizedMatrix{N1,N2}(rand(N1,N2))
C = SizedMatrix{N1,N2}(rand(N1,N2))
α,β = 1.0, 2.0

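# Entry point: wraps the scalars in a MulAddMul. Its type encodes isone(α) and
# iszero(β), so the concrete type depends on runtime values.
@inline function myadd!(C,A,B,α::Number,β::Number)
    _myadd!(C,A,B,LinearAlgebra.MulAddMul(α,β))
end

# Kernel: computes C = A*alpha + B*beta elementwise, in place.
function _myadd!(C,A,B,_add::LinearAlgebra.MulAddMul)
    C .= A .* _add.alpha  .+ B .* _add.beta
end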

myadd!(C,A,B,α,β)
@btime myadd!($C,$A,$B,$α,$β)
#   1.705 μs (3 allocations: 64 bytes)
@btime _myadd!($C,$A,$B,$(LinearAlgebra.MulAddMul(α,β)))
#   1.375 μs (0 allocations: 0 bytes)
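
To make the expected instability concrete, here is a small check. MulAddMul is internal to LinearAlgebra, so I’m assuming its first two type parameters record isone(alpha) and iszero(beta), as they do in the versions I’ve looked at.

```julia
using LinearAlgebra

# Both calls take (Float64, Float64) arguments.
p_general = LinearAlgebra.MulAddMul(1.5, 2.0)  # neither special case applies
p_special = LinearAlgebra.MulAddMul(1.0, 0.0)  # alpha == 1 and beta == 0

# If the special cases are encoded in the type (assumption stated above),
# the two objects have different concrete types even though the argument
# types are identical, so the constructor's return type depends on values.
same_type = typeof(p_general) == typeof(p_special)
@show same_type
```

So inside myadd! the compiler presumably cannot know which concrete MulAddMul it is passing to _myadd! unless α and β are compile-time constants.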

My actual code ends up calling @generated functions that unroll the loops and include only the terms that the MulAddMul type requires. Since this is all decided at compile time, MulAddMul is very useful. None of the rest of my code allocates. It’s also worth noting that no allocations occur if literal constants are passed for alpha and beta instead of variables.
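
A stripped-down sketch of that pattern is below. This is not my actual code: the name axpby_sketch! is made up for illustration, it works on plain vectors with an ordinary loop rather than unrolling, and it again assumes the first two type parameters of the internal MulAddMul type are the isone(alpha) and iszero(beta) flags.

```julia
using LinearAlgebra

# Hypothetical kernel: C = A*alpha + C*beta, with terms dropped at compile time.
@generated function axpby_sketch!(C::AbstractVector, A::AbstractVector,
        p::LinearAlgebra.MulAddMul{ais1,bis0}) where {ais1,bis0}
    # Skip the multiplication by alpha when alpha is known to be one.
    ax = ais1 ? :(A[i]) : :(A[i] * p.alpha)
    # Skip the C*beta term entirely when beta is known to be zero.
    body = bis0 ? ax : :($ax + C[i] * p.beta)
    quote
        @inbounds for i in eachindex(C, A)
            C[i] = $body
        end
        return C
    end
end

A = [1.0, 2.0]
C = [10.0, 20.0]
axpby_sketch!(C, A, LinearAlgebra.MulAddMul(2.0, 1.0))  # C = 2A + C
general_result = copy(C)
axpby_sketch!(C, A, LinearAlgebra.MulAddMul(1.0, 0.0))  # C = A
```

Called directly with an already constructed MulAddMul like this, the kernel itself is not the problem. The allocations only show up when the MulAddMul is built from runtime α and β in the caller.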

What am I missing here, and what can I do to avoid the unnecessary allocations? How do the methods in the LinearAlgebra standard library avoid this issue?