I made a minimum reproducible example for a performance issue I am having with my code:
mutable struct Array_container
array_field::Array{Float64}
end
function foo(a::Array_container, b::Array_container)
a.array_field[1] - b.array_field[1]
end
function run_it_n_times(n::Int64)
a = Array_container([1.1])
b = Array_container([2.2])
for i in (1:n)
foo(a, b)
end
end
Specifically, the docs say that it is “dynamic dispatch, run-time method lookup, or a virtual call”. I don’t see any obvious type instability (I am a beginner though), or any other issues addressed in the Performance Tips section of the manual. What is wrong with this code and how do I fix its performance? Thank you!
Thank you for the reply! I don’t think that’s the problem in this case. The following code does not show those red bars:
function foobar(a::Array{Float64}, b::Array{Float64})
a[1] - b[1]
end
function run_foobar_n_times(n::Int64)
for i in (1:n)
foobar([1.1], [2.2])
end
end
Abstract field types in structs are a performance trap. Use concrete, possibly parameterized, types for the fields.
Also, do you need Array_container to be a mutable struct? That could also affect performance. Since the wrapped array is already mutable, you don’t need the outer type to be mutable also, in most cases.
BTW, the convention for type names is to use CapitalCase not underscores, so ArrayContainer.
Without fully parametrizing the Array, it remains a UnionAll type, which is not concrete (although the performance tips in the docs speaks about abstract types, I think non-concrete would be a better term)
The explanation is that the compiler cannot know the memory layout of a UnionAll type, so it has to compile code for checking the runtime type (which is concrete). That check is the dynamic dispatch in the OP.
You can use Vector{Float64} for one-dimensional arrays, it looks nicer.
Thank you very much for the answers and explanations, my Julia codebase performance finally matches that of C++!
Should something along these lines be put in the Avoid fields with abstract containers section of the docs? The docs skirt this issue, but seem to never directly address it (unless if I’m missing it), and other beginners may have similar points of confusion.
Perhaps the simple addition of the following to that section of the docs:
=============================
struct MyContainer
a::Array{Float64}
end
is bad: it’s still an abstract type since the dimensionality is not specified. Instead, consider using something like:
struct MyContainer
a::Array{Float64, 1}
end
(replace 1 with n for n-dimensions) or
struct MySimpleContainer{A<:AbstractVector}
a::A
end
=============================
This addition to that section may help a lot of people. In my case, it would have saved me quite a few hours.
Our performance tips should probably mention that Array isn’t Vector and that can be a performance issue. I’m not sure why but a ton of newcomers use Array when they really want Vector or Matrix. Maybe it’s from Matlab where those aren’t distinct?
Huh, I just tried replacing all Array{Float64, 1} with Vector{Float64} and it’s not making any noticeable performance difference in my case. What really helped was specifying the dimension. This could just be my use case though, perhaps in other cases using Vector instead makes a big difference?
Vector{T} is an alias for Array{T,1}. They are exactly the same. The problem is only when using Array makes you forget a parameter and therefore have non-static types.
Slightly off-topic, but I was digging through the Base sources to figure out how Array is defined exactly to further understand the behaviour above. But base/array.jl does not contain a definition of Array as far as I can tell, nor do any of the other julia source files. I see the Array type being referenced at the top of base/Base.jl, even before base/array.jl is included. Is the Array type defined in C code?
I think so, yes. It’s one of the few things defined that way, possibly because arrays are needed very early on in the chain, when almost none of the machinery for calling and compiling julia exists.
Coming from C++ this is a real mind-bender. The conceptually equivalent construct in C++ would be a template Array<T,N>, but there is no way to reference Arrayas a type without filling in all of the parameters.
On a project with friends we talk about the style that we will use for the code and most of them wanted Array{Float64, 1} over Vector{Float64}. I think it is because they are use to write Array{Float64, 1} in Julia as they started this way when they did not know the existence of Vector.