Thanks for the comments! Nice to see this PR.
That’s exactly what I did. However, I had to add something like
z = zero(TY)
@inbounds for i in eachindex(Y)
    Y[i] = z
end
because I can’t assume that Y is initialized with zeros (I’m applying A_mul_B! multiple times).
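For reference, here is a minimal sketch of why the reset matters, assuming the kernel accumulates into Y with `+=`. The name `my_A_mul_B!` and the triple loop are hypothetical stand-ins, not the code from the PR. `fill!` is the Base function that does the same thing as the explicit zeroing loop above.

```julia
# Hypothetical accumulating kernel (not the PR's code): computes Y = A * B
# by adding into Y, so Y has to be reset at the start of every call.
function my_A_mul_B!(Y::AbstractMatrix{TY}, A::AbstractMatrix, B::AbstractMatrix) where {TY}
    size(Y) == (size(A, 1), size(B, 2)) || throw(DimensionMismatch("Y has the wrong size"))
    size(A, 2) == size(B, 1) || throw(DimensionMismatch("A and B do not match"))
    # Equivalent to the explicit loop: set every element of Y to zero(TY).
    fill!(Y, zero(TY))
    for j in axes(B, 2), k in axes(A, 2), i in axes(A, 1)
        Y[i, j] += A[i, k] * B[k, j]
    end
    return Y
end

A = [1.0 2.0; 3.0 4.0]
B = [1.0 0.0; 0.0 1.0]
Y = fill(NaN, 2, 2)      # deliberately not zero-initialized
my_A_mul_B!(Y, A, B)
my_A_mul_B!(Y, A, B)     # reusing Y: the second call must not add to the first result
```

Without the `fill!` line, the NaN entries (or the result of the first call) would leak into the output of the second call.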