I noticed a lot of memory allocations with IntervalArithmetic.jl on Julia 1.5.3 (it appears to be the same in 1.6) when I used a vector of Interval as an input for a custom function in a module. Interestingly, redefining the function in the REPL fixes the problem.
Here is a MWE:
julia> module MyModule
function foo(v)
w = zero.(v)
for i ∈ eachindex(w)
w[i] += v[i]*v[i]
end
return w
end
export foo
end
Main.MyModule
julia> using IntervalArithmetic, BenchmarkTools
julia> v = Interval.(rand(10_000));
julia> @btime MyModule.foo($v);
3.834 ms (60002 allocations: 1.07 MiB)
julia> function MyModule.foo(v)
w = zero.(v)
for i ∈ eachindex(w)
w[i] += v[i]*v[i]
end
return w
end
julia> @btime MyModule.foo($v);
1.809 ms (2 allocations: 156.33 KiB)
I am not sure if this is a problem with IntervalArithmetic.jl, in which case I’ll open an issue on GitHub. Indeed, I remember seeing this post; though this may be completely unrelated.
Any insights on how to prevent / fix the observed behaviour without having to redefine the functions in the REPL?
@Skoffer, will do thanks. However it does not solve the issue.
@dpsanders I agree. It feels like it is not linked to IntervalArithemtic.jl.
On the other hand, I can not reproduce it with a basic custom struct or with Measurements.jl:
julia> module MyModule
function foo(v)
w = zero.(v)
for i ∈ eachindex(w)
w[i] += v[i]*v[i]
end
return w
end
export foo
end
Main.MyModule
julia> using .MyModule, Measurements, BenchmarkTools
julia> v = measurement.(rand(10_000));
julia> @btime foo($v);
348.917 μs (2 allocations: 312.58 KiB)
julia> function MyModule.foo(v)
w = zero.(v)
for i ∈ eachindex(w)
w[i] += v[i]*v[i]
end
return w
end
julia> @btime foo($v);
348.993 μs (2 allocations: 312.58 KiB)
So perhaps it is related to IntervalArithmetic.jl?..
I had some time to investigate a bit: it seems to be a problem with the multiplication *:
module MyModule
function foo(x)
for i ∈ 1:10
x * x
end
return nothing
end
export foo
end
using .MyModule, IntervalArithmetic, BenchmarkTools
x = @interval(π);
@btime foo($x); # 1.458 μs (60 allocations: 960 bytes)
It looks like there are 6 extra allocations per multiplication. These allocations disappear after redefining the function foo.
Despite the fact that the MWE is associated with IntervalArithmetic.jl, it is still not clear wether it is a Julia issue (especially the mystery around redefining the function).
Furthermore, if one does not call foo(x) before redefining foo in the REPL then the problem persists. I guess calling foo(x) triggers compilation and then redefining the function does … something …
FWIW, there seems to be quite a difference in the output of @code_native foo(x) before and after redefining the function (assuming we triggered compilation as discussed in the previous paragraph).
Perhaps this might mean something to an expert? I can not make much sense of @code_native output .
Not sure, but could be related to the fact that for functions which end up recursing back on themselves, there’s some limits on inference than can cause you to get different results depending on the order you call things in in a given session. See e.g. this blog post. It could be that redefining the function triggers inference in a different order than the original, making it work.
If its this, in theory this could be solved with the right precompile directive, not that its necessarily easy to figure out which. Or perhaps by rewriting whatever is the offending recursive function in IntervalArithmetic to recurse in a way thats more friendly to the Julia compiler (in general, this means the arguments to each recursive call should have less deeply nested parametric types and shorter tuples than the original arguments, never the opposite).
Thanks for the feedback It seems however that for more “complicated functions” inlining the functions fails. Consider the following MWE:
module MyModule
@inline function foo(v)
w = zero.(v)
for i ∈ eachindex(w)
_foo2!(w, v, i)
end
return w
end
@inline function _foo2!(w, v, i)
for j ∈ 1:i
w[i] += v[i]*v[j]
end
end
export foo
end
using IntervalArithmetic, BenchmarkTools
v = Interval.(rand(10_000));
@btime MyModule.foo($v); # 10.147 s (300030002 allocations: 4.47 GiB)
function MyModule._foo2!(w, v, i)
for j ∈ 1:i
w[i] += v[i]*v[j]
end
end
@btime MyModule.foo($v); # 4.938 s (2 allocations: 156.33 KiB)
I also don’t have a good knowledge of how inlining works. Hopefully someone with a better understanding of inlining can step in to help us