I have multiple methods with the same name but different argument types. When I add a fourth method with a distinct argument type, I observe a significant increase in memory allocations during the first run. I understand that this likely increases compiler workload. I’m curious why three methods seem to be the sweet spot, and what causes the increase in allocations with the fourth. Is it a valid approach to add extra dummy arguments with different types to help the compiler handle this more efficiently?
The following toy example illustrates the behavior I see when I run more complex code and measure the time for the first run. I realize the example may not be ideal and that I might be measuring time in the wrong place, but the main point is understanding why this happens. Specifically, is the compiler designed to behave this way?
module Test

struct Test1
    a::Array{Float64,1}
end
struct Test2
    a::Array{Int64,1}
end
struct Test3
    a::Float64
end
struct Test4
    a::Int64
end

function form(a::Array{Float64,1})
    return Test1(a)
end
function form(a::Array{Int64,1})
    return Test2(a)
end
function form(a::Float64)
    return Test3(a)
end
function form(a::Int64)
    return Test4(a)
end

function solve(obj::Test1)
    obj.a + obj.a
end
function solve(obj::Test2)
    obj.a + obj.a
end
function solve(obj::Test3)
    obj.a + obj.a
end
function solve(obj::Test4)
    obj.a + obj.a
end

obj = form(fill(0.0, 10))
@time solve(obj)

end
0.003744 seconds (1.10 k allocations: 78.141 KiB, 99.49% compilation time)
I understand that if the increased allocations are simply an artifact of the @time macro, then that’s fine, and I’m open to that explanation. However, if that’s not the case, I’m still curious why I’m seeing this significant increase in allocations. Could there be another reason for this behavior?
In the second case, I also use the @time macro, but I only see one allocation.
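One way to check whether the allocations come from compilation: time the call twice in the same session, since only the first call compiles.

obj = form(fill(0.0, 10))
@time solve(obj)  # first call: compiles solve(::Test1), so allocations include compilation
@time solve(obj)  # second call: already compiled, steady-state allocations only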
I tested this and got the same results. The following, with only three methods of solve defined, doesn’t allocate much in a fresh REPL:
julia> struct Test1
           a::Array{Float64,1}
       end

julia> struct Test2
           a::Array{Int64,1}
       end

julia> struct Test3
           a::Float64
       end

julia> struct Test4
           a::Int64
       end

julia> function form(a::Array{Float64,1})
           return Test1(a)
       end
form (generic function with 1 method)

julia> function form(a::Array{Int64,1})
           return Test2(a)
       end
form (generic function with 2 methods)

julia> function form(a::Float64)
           return Test3(a)
       end
form (generic function with 3 methods)

julia> function form(a::Int64)
           return Test4(a)
       end
form (generic function with 4 methods)

julia> function solve(obj::Test1)
           obj.a + obj.a
       end
solve (generic function with 1 method)

julia> function solve(obj::Test2)
           obj.a + obj.a
       end
solve (generic function with 2 methods)

julia> function solve(obj::Test3)
           obj.a + obj.a
       end
solve (generic function with 3 methods)

julia> obj = form(fill(0.0, 10))
Test1([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

julia> @time solve(obj)
  0.000005 seconds (1 allocation: 144 bytes)
I’m guessing there’s some clever compiler optimization when a function has 3 or fewer methods, but I’d also be curious to know what it is and how it works.
Yes, the behavior is quite puzzling. When I add a dummy argument, the compiler seems to differentiate between the methods more effectively. Could the reason be related to the use of built-in concrete types versus custom types?
module Test

struct Test1
    a::Array{Float64,1}
end
struct Test2
    a::Array{Int64,1}
end
struct Test3
    a::Float64
end
struct Test4
    a::Int64
end
struct Test5
    a::String
end

function form(a::Array{Float64,1})
    return Test1(a)
end
function form(a::Array{Int64,1})
    return Test2(a)
end
function form(a::Float64)
    return Test3(a)
end
function form(a::Int64)
    return Test4(a)
end

function solve(obj::Test1, dummy::Array{Float64,1})
    obj.a + obj.a
end
function solve(obj::Test2, dummy::Array{Int64,1})
    obj.a + obj.a
end
function solve(obj::Test3, dummy::Float64)
    obj.a + obj.a
end
function solve(obj::Test4, dummy::Int64)
    obj.a + obj.a
end

obj = form(fill(0.0, 10))
@time solve(obj, [0.0])

end
Maybe it’s something related to max_methods. See the help for Base.Experimental.@max_methods and Base.Experimental.@compiler_options, and try playing around with the settings, if interested.
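For example, a minimal sketch (assuming Julia 1.8 or later, where Base.Experimental.@max_methods is available): raising the per-module limit from the default of 3 to 4 tells inference to consider up to four matching methods at a call site before giving up.

module Test

# Consider up to 4 potentially-matching methods during inference
# instead of the default 3 (supported values are 1 through 4).
Base.Experimental.@max_methods 4

# ... same structs and form/solve methods as in the first example ...

obj = form(fill(0.0, 10))
@time solve(obj)

end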
No, adding dummy arguments would be unnecessary and, frankly, horrible.
The compiler is designed to do lots of work by default. If you want less compilation/optimization, opt into that explicitly, using the experimental settings above or the documented command-line options.
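For example (these flags are documented, though whether they actually help depends on the workload):

julia --optimize=1 --compile=min script.jl

Here --compile=min requests minimal compilation of specialized code, and --optimize=1 lowers the optimization level for whatever does get compiled.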
Solution-wise, I’m going to assume there is a finite number of possible TestN types. In that case, using a sum-types package like SumTypes.jl, DynamicSumTypes.jl, or Moshi.jl would remove the allocations related to type instability.
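For instance, a rough sketch with SumTypes.jl (the variant names here are mine, and the @sum_type/@cases usage follows that package’s README; treat it as illustrative rather than tested):

using SumTypes

@sum_type TestObj begin
    T1(::Vector{Float64})
    T2(::Vector{Int64})
    T3(::Float64)
    T4(::Int64)
end

# A single solve method over one concrete type: call sites no longer
# need to be inferred across four separate methods.
solve(obj::TestObj) = @cases obj begin
    T1(a) => a + a
    T2(a) => a + a
    T3(a) => a + a
    T4(a) => a + a
end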
Are you sure non-constant globals aren’t the culprit? Because if I change obj = form(fill(0.0, 10)) to
const obj = form(fill(0.0, 10))
or
obj::Test1 = form(fill(0.0, 10))
or
let obj = form(fill(0.0, 10))
    @time solve(obj)
end
Then the time reported is back to
0.000003 seconds (1 allocation: 144 bytes)
All in a new session, and without removing any methods of solve.
This suggests that obj at module scope is simply a non-constant global, which the compiler must assume could hold a value of any type. Without const, a type annotation (the “typed global” syntax), or forcing it to be a local variable via let, the behaviour we observe is exactly what happens when referencing any untyped global variable: the type of obj cannot be inferred at the call site, so solve is dispatched dynamically.
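A minimal illustration of that mechanism (the names x, y, f, g are stand-ins of mine):

x = 1.0         # non-const, untyped global: its type may change at any time
f() = x + x     # inference can only assign x the type Any inside f

const y = 1.0   # const global: type and value are fixed
g() = y + y     # inferred as Float64, no runtime dispatch

Inspecting f with @code_warntype shows the Any-typed global access, while g is fully inferred.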