AND THIS WORKS.
At least I can see it work if the input arrays are StaticArrays (which is my real use case anyway). There is a gain in performance, and the output of @code_typed clearly changes length.
using StaticArrays
using BenchmarkTools
struct NotWanted end
function Base.setindex!(A::NotWanted,X,inds...) end
function foo!(o1,o2,o3,i1,i2)
    o1[:] = 2*i1
    o2[:] = 3(i1.+i2)
    tmp = i1.+i2
    o3[:] = 3tmp
    return nothing
end
const n = 100
o1 = Vector{Float64}(undef,n)
o2 = Vector{Float64}(undef,n)
o3 = Vector{Float64}(undef,n)
nw = NotWanted()
i1 = @SVector [randn() for i = 1:n]
i2 = @SVector [randn() for i = 1:n]
@btime foo!(o1,o2,o3,i1,i2)
@btime foo!(nw,o2,o3,i1,i2)
@btime foo!(o1,nw,o3,i1,i2)
@btime foo!(o1,o2,nw,i1,i2)
@btime foo!(o1,nw,nw,i1,i2)
@btime foo!(nw,o2,nw,i1,i2)
@btime foo!(nw,nw,o3,i1,i2)
@btime foo!(nw,nw,nw,i1,i2)
yields
268.151 ns (0 allocations: 0 bytes)
181.903 ns (0 allocations: 0 bytes)
180.184 ns (0 allocations: 0 bytes)
179.268 ns (0 allocations: 0 bytes)
98.613 ns (0 allocations: 0 bytes)
105.870 ns (0 allocations: 0 bytes)
110.408 ns (0 allocations: 0 bytes)
16.733 ns (0 allocations: 0 bytes)
The Julia creators comment that “immutables are easier to reason about”: maybe the compiler finds it easier to prove that a stack-allocated immutable variable will never be needed…
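For anyone who wants to check the "@code_typed changes length" observation without reading the IR by eye, here is a minimal sketch. It assumes plain tuples as a stand-in for SVectors (so that it runs without StaticArrays), and `bar!` and `nstatements` are hypothetical names I made up for illustration; they are not part of the code above. The statement counts depend on the Julia version, so I am not quoting any numbers.

```julia
using InteractiveUtils  # provides code_typed outside the REPL

# Same trick as above: writes into a NotWanted are silently discarded.
struct NotWanted end
Base.setindex!(::NotWanted, X, inds...) = nothing

# Simplified stand-in for foo! (hypothetical), with immutable tuple inputs.
function bar!(o1, o2, i1, i2)
    o1[1] = 2 * sum(i1)
    o2[1] = 3 * sum(i1 .+ i2)
    return nothing
end

# Hypothetical helper: number of statements in the optimized typed IR
# of f for the given tuple of argument types.
function nstatements(f, argtypes)
    codeinfo = first(code_typed(f, argtypes)).first
    return length(codeinfo.code)
end

T = NTuple{4,Float64}
n_full = nstatements(bar!, Tuple{Vector{Float64},Vector{Float64},T,T})
n_none = nstatements(bar!, Tuple{NotWanted,NotWanted,T,T})

println("statements with both outputs wanted: ", n_full)
println("statements with no output wanted:    ", n_none)
```

If the compiler really does drop the unused computations, the second count should come out smaller than the first. Note that this only looks at Julia's typed IR; LLVM may remove more later, so @code_llvm or a benchmark is still the final word.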
Thanks everyone!