Probably fixed by JuliaLang/julia pull request #25040, "fix :typeinfo with views (closes #25038)", by rfourquet.
That’s fast, wow.
Another very annoying fact is that all the new allocation optimizations appear to work only when inlined. That is (sorry for the silly example):
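# Note: on Julia 1.0 and later, write Vector{Int}(undef, n) instead of Vector{Int}(n).
@inline foo_in(n) = (n, Vector{Int}(n))
@noinline foo_ni(n) = (n, Vector{Int}(n))

# Calls the inlined variant in a loop.
function ft_in(n)
    s = 0
    for i = 1:n
        jj, v = foo_in(i)
        s += sum(v)
    end
    s
end

# Same loop, but the callee is not inlined.
function ft_ni(n)
    s = 0
    for i = 1:n
        jj, v = foo_ni(i)
        s += sum(v)
    end
    s
end

@time ft_in(1000)
0.001948 seconds (1.00 k allocations: 3.962 MiB)
@time ft_ni(1000)
0.002083 seconds (2.00 k allocations: 3.992 MiB)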
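The non-inlined version makes 2.00 k allocations instead of 1.00 k, i.e. one extra heap allocation per call, presumably for the returned tuple. This means that multiple return values, some of which are not bitstype, are only performant when inlined.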
This means that in API design one has to decide whether to @inline, or to make the API look like old-style C by demanding horrible Ref{Int} inputs for writing the output (where the caller is hopefully re-using the Ref{Int}). One also has to hope that the caller re-uses the Ref{Int} often enough not to cause additional cache misses (because it can’t point to the stack, as in C).
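For concreteness, here is a minimal sketch of what that C-style workaround might look like. The names `foo_ref!` and `ft_ref` are hypothetical and not part of the example above, and the sketch uses `zeros` rather than an uninitialized vector so that the sum is well-defined. Whether this actually saves the extra allocation depends on the Julia version, so treat it as an illustration of the API shape rather than a benchmark.

```julia
# Hypothetical out-parameter variant of foo_ni: the Int result is written
# into a caller-supplied Ref{Int}, so only the Vector is returned.
@noinline function foo_ref!(out::Ref{Int}, n)
    out[] = n
    return zeros(Int, n)  # assumption: zero-filled so that sum(v) is deterministic
end

function ft_ref(n)
    s = 0
    out = Ref{Int}(0)     # allocated once, then re-used on every iteration
    for i = 1:n
        v = foo_ref!(out, i)
        s += sum(v) + out[]   # out[] holds the value that jj had in the tuple version
    end
    return s
end
```

The caller now has to create the `Ref{Int}` and remember to read `out[]` after each call, which is exactly the ergonomic cost complained about above.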
I mean, we have a really nice ABI for returning bitstype-structs, please use it for immutables containing ref-fields as well!