Hi!
Since this is my first post, let me say that I have worked with Julia for some work and hobby projects until now, and that I really like the language and its possibilities – and its amazing community, sharing their insights and providing resources to learn. Still, I occasionally stumble upon some things which I would like to understand better, mainly to improve the design/performance of my code, but also to just learn more about Julia’s internals and programming as a whole. I hope I’ve come to the right place with this kind of question.
In this (somewhat contrived) example code, I’m not sure why the performance differs significantly between the two implementations of an accessor function (one with a specialized method for the concrete type and one without):
using BenchmarkTools

abstract type AbstractElement end

struct ConcreteElement <: AbstractElement
    D::Float64
end

struct UnstableStruct
    v::Vector{AbstractElement}
end

get_D_generic(el)::Float64 = el.D

get_D_specialized(el)::Float64 = el.D
get_D_specialized(el::ConcreteElement) = el.D

function copy_values_to_vector1(s::UnstableStruct)
    result = zeros(length(s.v))
    for i in eachindex(s.v)
        result[i] = get_D_generic(s.v[i])
    end
    return result
end

function copy_values_to_vector2(s::UnstableStruct)
    result = zeros(length(s.v))
    for i in eachindex(s.v)
        result[i] = get_D_specialized(s.v[i])
    end
    return result
end

function main()
    elements = ConcreteElement.(rand(1000))
    s = UnstableStruct(elements)
    result1 = @btime copy_values_to_vector1($s)
    result2 = @btime copy_values_to_vector2($s)
    @assert result1 == result2
end

main()
which produces on my machine:
12.914 μs (1001 allocations: 23.56 KiB)
1.462 μs (1 allocation: 7.94 KiB)
As far as I could gather from the docs, searching the forums, and looking at the @code_xx macros, this is what’s going on:
- As soon as a function is called within the code, a specialized method will be compiled for the concrete types of the arguments, so the generated code for get_D_generic(::ConcreteElement) and get_D_specialized(::ConcreteElement) should be exactly the same.
- Inside the loop of copy_values_to_vector1 and copy_values_to_vector2, the type of each element of the vector is not known, but the output type of calling get_D_... is known to be Float64.
- If I remove the output type annotation of get_D_specialized(el), the performance of both versions is the same. If I remove the “generic” version of get_D_specialized altogether (i.e. only keep get_D_specialized(::ConcreteElement)), then the difference reappears.
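The inference-related observations in the list above can be checked with a small sketch like the following. It uses Base.return_types, which reports what the compiler infers for a given tuple of argument types. get_D_plain is a hypothetical, un-annotated variant added only for comparison; it is not part of the original code. The output is deliberately not shown, since it is best checked on your own Julia version.

```julia
abstract type AbstractElement end

struct ConcreteElement <: AbstractElement
    D::Float64
end

get_D_generic(el)::Float64 = el.D   # annotated catch-all, as in the post
get_D_plain(el) = el.D              # hypothetical variant without the annotation

# Inferred return type when the argument type is concrete
@show Base.return_types(get_D_generic, (ConcreteElement,))
@show Base.return_types(get_D_plain, (ConcreteElement,))

# Inferred return type when only the abstract type is known, which is
# the situation inside the loop over a Vector{AbstractElement}
@show Base.return_types(get_D_generic, (AbstractElement,))
@show Base.return_types(get_D_plain, (AbstractElement,))
```

This only shows what is inferred about the return value. It does not by itself show how the call inside the loop is dispatched, which is where @code_warntype or @code_typed on copy_values_to_vector1 and copy_values_to_vector2 would be more informative.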
What I thought was happening is that the code using get_D_generic(el) has to do some additional type check and/or conversion because it cannot know for sure that the field D will actually contain a float.
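To make the "type check and/or conversion" idea concrete: as far as I understand the manual, a return type annotation roughly corresponds to converting the returned value and then asserting its type. The sketch below spells that out by hand. IntElement and get_D_explicit are hypothetical names introduced only for this illustration.

```julia
# Hypothetical type whose field D is not a Float64
struct IntElement
    D::Int
end

# Annotated accessor, same form as get_D_generic in the post
get_D_annotated(el)::Float64 = el.D

# Hand-written approximation of what the annotation does:
# convert the value, then assert the resulting type
get_D_explicit(el) = convert(Float64, el.D)::Float64

x = get_D_annotated(IntElement(3))
y = get_D_explicit(IntElement(3))
@show x typeof(x)
@show y typeof(y)
```

When the field is already a Float64, the conversion is a no-op, so for ConcreteElement the annotated and un-annotated methods should do the same work once the argument type is known.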
But my main point of confusion is: why is it necessary to implement the specialized version for ConcreteElement with identical code? The presence of a generic “catch-all” method get_D_specialized(el)::Float64 seems to have no impact on looking up the right (manually defined) method in each loop iteration, but when using get_D_generic(el)::Float64, the loop function does not generate and pick a specialized version of it?
My current approach (if I don’t want to avoid the abstract-type container in the struct) would be to just define the methods for the individual concrete types, but just using one generic method would seem like a very convenient and sensible thing to do if all implementations have the same field D.
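For comparison, here is a minimal sketch of the other route mentioned in passing above, i.e. avoiding the abstract-typed container. ParametricStruct, get_D, and copy_values_to_vector are hypothetical names for this illustration. The assumption is that all elements of one container share a single concrete type, which is not always the case in real code.

```julia
abstract type AbstractElement end

struct ConcreteElement <: AbstractElement
    D::Float64
end

# Hypothetical parametric container: the element type T is fixed per
# instance, so the compiler knows the type of s.v[i] inside the loop
struct ParametricStruct{T<:AbstractElement}
    v::Vector{T}
end

# A single generic accessor, with no per-type methods
get_D(el)::Float64 = el.D

function copy_values_to_vector(s::ParametricStruct)
    result = zeros(length(s.v))
    for i in eachindex(s.v)
        result[i] = get_D(s.v[i])
    end
    return result
end

elements = ConcreteElement.([1.0, 2.0, 3.0])
s = ParametricStruct(elements)
result = copy_values_to_vector(s)
@show typeof(s) result
```

With this layout one generic accessor should be enough, but it gives up the ability to mix different element types in one vector, which may be the reason for the abstract container in the first place.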
I can imagine that variations of this issue have been discussed before and I just couldn’t find or understand them, so I would also be happy about a link in the right direction.