I consider two methods presented in the example below. As expected, the second one is better since it uses only concrete types.
ABSTRACT COLLECTION
34.190 ms (999490 allocations: 15.25 MiB)
CONCRETE COLLECTION
1.307 ms (0 allocations: 0 bytes)
Is there any way to have a good performance with a more elegant code?
using BenchmarkTools
abstract type HasInt end
struct A <: HasInt
a::Int
b::Float64
end
struct B <: HasInt
a::Int
b::Float64
c::Float64
end
struct C <: HasInt
a::Int
b::Float64
c::Float64
d::Float64
end
value(obj::A) = obj.a
value(obj::B) = obj.a + 1
value(obj::C) = obj.a + 2
function createabstractcollection()
srand(77777);
arr = Vector{HasInt}()
for i in 1:1000000
t = rand(1:3)
if t == 1
push!(arr, A(i, 7.0))
elseif t == 2
push!(arr, B(i, 7.0, 7.0))
else
push!(arr, C(i, 7.0, 7.0, 7.0))
end
end
return arr
end
function sumabstractcollection(arr)
s = 0
for obj in arr
s += value(obj)
end
return s
end
struct concretecollection
A_vector::Vector{A}
B_vector::Vector{B}
C_vector::Vector{C}
end
concretecollection() = concretecollection(Vector{A}(), Vector{B}(), Vector{C}())
function createconcretecollection()
srand(77777);
arr = concretecollection()
for i in 1:1000000
t = rand(1:3)
if t == 1
push!(arr.A_vector, A(i, 7.0))
elseif t == 2
push!(arr.B_vector, B(i, 7.0, 7.0))
else
push!(arr.C_vector, C(i, 7.0, 7.0, 7.0))
end
end
return arr
end
function sumconcretecollection(arr)
s = 0
for obj in arr.A_vector
s += value(obj)
end
for obj in arr.B_vector
s += value(obj)
end
for obj in arr.C_vector
s += value(obj)
end
return s
end
function test()
println("ABSTRACT COLLECTION")
absarr = createabstractcollection()
@btime sumabstractcollection($absarr)
println("CONCRETE COLLECTION")
concrarr = createconcretecollection()
@btime sumconcretecollection($concrarr)
end
test()
Maybe using a tuple of concrete vectors instead of concretecollection might make it more elegant. For example, Your sumconcretecollection could be rewritten as:
function createconcretecollection()
srand(77777);
arr = (Vector{A}(), Vector{B}(), Vector{C}())
for i in 1:1000000
t = rand(1:3)
if t == 1
push!(arr[1], A(i, 7.0))
elseif t == 2
push!(arr[2], B(i, 7.0, 7.0))
else
push!(arr[3], C(i, 7.0, 7.0, 7.0))
end
end
return arr
end
function sumconcretecollection(arr::Tuple)
s = 0
for el in arr, obj in el
s += value(obj)
end
return s
end
Edit: Well, my code actually gives the worst speed
Another option is to manually dispatch for a few types in the abstract case.
function sumabstractcollectiondispatch(arr)
s = 0
for obj in arr
if isa(obj, A)
s += value(obj::A)
elseif isa(obj, B)
s += value(obj::B)
elseif isa(obj, C)
s += value(obj::C)
else
s += value(obj)
end
end
return s
end
You can store a concretely-typed vector of FunctionWrappers instead (essentially the same as storing a function pointer in C). Or if you want a slightly higher-level abstraction, you can try out my (unregistered) Interfaces package (which just stores a struct containing function pointers for your methods of interest, similar to the way virtual methods in C++ would work).
You could also use TypeSortedCollections.jl for this (I’m the author). It basically automates your createconcretecollection/sumconcretecollection approach:
julia> using TypeSortedCollections
julia> tsc = TypeSortedCollection(createabstractcollection());
julia> @btime mapreduce(value, +, 0, $tsc)
1.204 ms (0 allocations: 0 bytes)
500001499788
I really wish there was something like Rust Enums with fields in Julia Enums - Rust By Example together with some nice pattern-matching syntax to do handle these cases efficiently.
It is so far the single language feature that I found in other languages which I am missing now and then when using Julia.
Thank you all for your suggestions. @tkoolen, Indeed TypeSortedCollections.jl seems to be a very good solution if the array isn’t updated often. If I understood correctly, I need to build a new TypeSortedCollection each time it gets updated before I loop on it, which is time consuming.
Well, there are push! and append! methods to modify an existing TypeSortedCollection, but the concrete types that are accepted must be specified by the user or automatically determined from an input collection when the TypeSortedCollection is constructed. I.e., you could also do: