Hello,
I am trying to understand why I get allocations when using a single generic function that relies on multiple dispatch, compared to functions specialized for each type. In the code below, I define two different structs that each contain a number of arrays. (In practice, I will have many more structs than this.) I then have code that calculates the total number of elements across all arrays contained within a struct.
using BenchmarkTools

struct x_t{T <: Real}
    x1::Array{T, 2}
    x2::Array{T, 2}
    x3::Array{T, 1}
    x4::Array{T, 2}
    x5::Array{T, 2}
    x6::Array{T, 3}
    x7::Array{T, 2}
end

struct y_t{T <: Real}
    y1::Array{T, 2}
    y2::Array{T, 3}
end

function x_t(T)
    return x_t(Array{T, 2}(undef, 10, 5),
               Array{T, 2}(undef, 20, 5),
               Array{T, 1}(undef, 20),
               Array{T, 2}(undef, 120, 9),
               Array{T, 2}(undef, 200, 11),
               Array{T, 3}(undef, 90, 25, 9),
               Array{T, 2}(undef, 10, 15)
               )
end

function y_t(T)
    return y_t(Array{T, 2}(undef, 10, 4),
               Array{T, 3}(undef, 10, 10, 20))
end

# Generic version: accepts any struct and sums the lengths of its array fields.
function calcLength(xy)
    total_length = 0
    for field_name in fieldnames(typeof(xy))
        field = getfield(xy, field_name)
        if isa(field, AbstractArray)
            total_length += length(field)
        end
    end
    return total_length
end

# Specialized versions: same body as calcLength, one method per struct type.
function calcLength2(x::x_t)
    total_length = 0
    for field_name in fieldnames(typeof(x))
        field = getfield(x, field_name)
        if isa(field, AbstractArray)
            total_length += length(field)
        end
    end
    return total_length
end

function calcLength2(y::y_t)
    total_length = 0
    for field_name in fieldnames(typeof(y))
        field = getfield(y, field_name)
        if isa(field, AbstractArray)
            total_length += length(field)
        end
    end
    return total_length
end
x = x_t(Float64)
y = y_t(Float64)
println(calcLength(x))
println(calcLength(y))
println(calcLength2(x))
println(calcLength2(y))
@btime calcLength(x)
@btime calcLength(y)
@btime calcLength2(x)
@btime calcLength2(y)
The output of this code (Julia 1.9.0) is:
23850
2040
23850
2040
127.265 ns (8 allocations: 464 bytes)
43.512 ns (3 allocations: 80 bytes)
96.447 ns (0 allocations: 0 bytes)
42.747 ns (0 allocations: 0 bytes)
As can be seen, the generic calcLength, which relies on multiple dispatch to handle either struct, allocates memory, whereas calcLength2, which has a separate method for each struct type, does not. I would like to understand why this is and whether it can be avoided. Thanks.
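
For comparison, here is a minimal sketch of one possible generic variant that avoids looking fields up by name inside a loop. It collects the field values into a tuple by index and lets dispatch decide what each field contributes. The names Demo, field_length and calcLengthUnrolled are hypothetical and not part of the code above, and whether this removes the allocations reported above is an assumption that has not been measured here.

```julia
# Hypothetical stand-in for x_t / y_t, with one non-array field for illustration.
struct Demo{T <: Real}
    a::Array{T, 2}
    b::Array{T, 1}
    n::Int
end

# Dispatch replaces the isa(field, AbstractArray) check:
# arrays contribute their length, any other field contributes nothing.
field_length(f::AbstractArray) = length(f)
field_length(f) = 0

# Build a tuple of field values by index. fieldcount(S) depends only on the
# type S, so the tuple length is tied to the type rather than to a runtime loop.
function calcLengthUnrolled(s::S) where {S}
    fields = ntuple(i -> getfield(s, i), Val(fieldcount(S)))
    return sum(field_length, fields)
end

d = Demo(zeros(3, 4), zeros(5), 7)
println(calcLengthUnrolled(d))  # 3*4 + 5 = 17; the Int field adds 0
```

Because the sketch has no per-type code, it would apply unchanged to x_t, y_t and any further structs. This sketch assumes the struct has at least one field, since summing over an empty tuple raises an error.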