Hello, inspired by the conclusion of JuliaCon, I wanted to push myself to write programs that consistently achieve zero allocations. There was a talk that gave recommendations on how to achieve this, in addition to referencing the performance guide, but I’m still having some difficulty identifying extraneous allocations. What are the best resources for hunting down allocations besides the traditional @time, @btime and @allocated macros? With these tools it’s easy to see that a function has excessive allocations, but not necessarily what is causing them.
For example, in the code below Afwd! has 2 allocations whereas Aadj_Afwd has 10. @code_warntype looks good, so it is not clear to me where these extra allocations are coming from. My next thought would be to use either @views or StaticArrays.jl. I don’t believe a view would speed up Small_Matrix[:], and StaticArrays.jl is not advisable due to the size of Big_Matrix.
Any suggestions would be greatly appreciated!
using LinearAlgebra
using BenchmarkTools  # provides @btime
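function Afwd!(out1, Small_Matrix, Big_Matrix)
    # Small_Matrix[:] copies the matrix into a new vector before the product
    mul!(out1, Big_Matrix, Small_Matrix[:])
end
function Aadj_Afwd(Small_Matrix, Big_Matrix, m)
    out1 = similar(Small_Matrix[:])  # copy, then allocate a vector of the same size
    mul!(out1, Big_Matrix, Small_Matrix[:])  # forward product (copies again)
    out2 = similar(out1)
    mul!(out2, transpose(Big_Matrix), out1)  # product with the transpose
    output = reshape(out2, m)  # back to the shape of Small_Matrix
    return output
end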
m = (64,64);
Small_Matrix= randn(m) + 1im.*randn(m);
Big_Matrix= randn(m.^2) + 1im.*randn(m.^2);
out1 = similar(Small_Matrix[:]);
@btime Afwd!(out1,Small_Matrix,Big_Matrix);
@btime Aadj_Afwd(Small_Matrix,Big_Matrix,m);
10.987 ms (2 allocations: 64.08 KiB)
22.268 ms (10 allocations: 256.41 KiB)
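One way to narrow this down is to measure each step on its own with @allocated. A 64×64 ComplexF64 matrix occupies 4096 × 16 bytes = 64 KiB, so the 64.08 KiB reported for Afwd! appears consistent with a single Small_Matrix[:] copy, and the 256.41 KiB for Aadj_Afwd with four vectors of that size (two Small_Matrix[:] copies and two similar calls). The sketch below is only an illustration under that assumption: the helper names colon_copy, as_vector and forward! are hypothetical and not part of the code above, and the matrices are smaller than in the post so it runs quickly.

```julia
using LinearAlgebra

m = (32, 32)  # smaller than the (64, 64) above, only to keep the sketch fast
Small_Matrix = randn(ComplexF64, m)
Big_Matrix = randn(ComplexF64, m .^ 2)

# Hypothetical helpers, one per step, so each can be measured separately.
colon_copy(A) = A[:]                   # indexing with a colon copies the data
as_vector(A) = vec(A)                  # reshapes; shares memory with A
forward!(out, B, v) = mul!(out, B, v)  # in-place matrix-vector product

v = vec(Small_Matrix)
out1 = similar(v)

# Run everything once so compilation is not counted in the measurements.
colon_copy(Small_Matrix)
as_vector(Small_Matrix)
forward!(out1, Big_Matrix, v)

bytes_colon = @allocated colon_copy(Small_Matrix)
bytes_vec = @allocated as_vector(Small_Matrix)
bytes_mul = @allocated forward!(out1, Big_Matrix, v)

println("Small_Matrix[:]   : ", bytes_colon, " bytes")
println("vec(Small_Matrix) : ", bytes_vec, " bytes")
println("mul!              : ", bytes_mul, " bytes")
```

If the numbers come out as expected, the colon indexing accounts for a full copy of Small_Matrix, while vec and mul! allocate far less than that. Note that vec returns a vector that aliases Small_Matrix, so writing to it also modifies the matrix.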