Looping over different types with common behavior

IssamT · June 29, 2018, 3:22pm

I consider two methods presented in the example below. As expected, the second one is better since it uses only concrete types.

ABSTRACT COLLECTION
  34.190 ms (999490 allocations: 15.25 MiB)
CONCRETE COLLECTION
  1.307 ms (0 allocations: 0 bytes)

Is there any way to have a good performance with a more elegant code?

using BenchmarkTools

abstract type HasInt end
struct A <: HasInt
    a::Int
    b::Float64
end
struct B <: HasInt
    a::Int
    b::Float64
    c::Float64
end
struct C <: HasInt
    a::Int
    b::Float64
    c::Float64
    d::Float64
end

value(obj::A) = obj.a
value(obj::B) = obj.a + 1
value(obj::C) = obj.a + 2

function createabstractcollection()
    srand(77777);
    arr = Vector{HasInt}()
    for i in 1:1000000
        t = rand(1:3)
        if t == 1
            push!(arr, A(i, 7.0))
        elseif t == 2
            push!(arr, B(i, 7.0, 7.0))
        else
            push!(arr, C(i, 7.0, 7.0, 7.0))
        end
    end
    return arr    
end

function sumabstractcollection(arr)    
    s = 0
    for obj in arr
        s += value(obj)
    end
    return s    
end

struct concretecollection
    A_vector::Vector{A}
    B_vector::Vector{B}
    C_vector::Vector{C}
end
concretecollection() = concretecollection(Vector{A}(), Vector{B}(), Vector{C}())

function createconcretecollection()
    srand(77777);
    arr = concretecollection()
    for i in 1:1000000
        t = rand(1:3)
        if t == 1
            push!(arr.A_vector, A(i, 7.0))
        elseif t == 2
            push!(arr.B_vector, B(i, 7.0, 7.0))
        else
            push!(arr.C_vector, C(i, 7.0, 7.0, 7.0))
        end
    end
    return arr    
end

function sumconcretecollection(arr)    
    s = 0
    for obj in arr.A_vector
        s += value(obj)
    end
    for obj in arr.B_vector
        s += value(obj)
    end
    for obj in arr.C_vector
        s += value(obj)
    end
    return s
end

function test()
    println("ABSTRACT COLLECTION")
    absarr = createabstractcollection()   
    @btime sumabstractcollection($absarr)
    
    println("CONCRETE COLLECTION")
    concrarr = createconcretecollection()
    @btime sumconcretecollection($concrarr)
end

test()

favba · June 29, 2018, 3:39pm

Maybe using a tuple of concrete vectors instead of concretecollection might make it more elegant. For example, Your sumconcretecollection could be rewritten as:

function createconcretecollection()
    srand(77777);
    arr = (Vector{A}(), Vector{B}(), Vector{C}())
    for i in 1:1000000
        t = rand(1:3)
        if t == 1
            push!(arr[1], A(i, 7.0))
        elseif t == 2
            push!(arr[2], B(i, 7.0, 7.0))
        else
            push!(arr[3], C(i, 7.0, 7.0, 7.0))
        end
    end
    return arr    
end

function sumconcretecollection(arr::Tuple)
    s = 0
    for el in arr, obj in el
        s += value(obj)
    end
    return s
end

Edit: Well, my code actually gives the worst speed

danielmatz · June 29, 2018, 3:51pm

Another option is to manually dispatch for a few types in the abstract case.

function sumabstractcollectiondispatch(arr)
    s = 0
    for obj in arr
        if isa(obj, A)
            s += value(obj::A)
        elseif isa(obj, B)
            s += value(obj::B)
        elseif isa(obj, C)
            s += value(obj::C)
        else
            s += value(obj)
        end
    end
    return s
end

rdeits · June 29, 2018, 4:20pm

You can store a concretely-typed vector of FunctionWrappers instead (essentially the same as storing a function pointer in C). Or if you want a slightly higher-level abstraction, you can try out my (unregistered) Interfaces package (which just stores a struct containing function pointers for your methods of interest, similar to the way virtual methods in C++ would work).

tkoolen · June 29, 2018, 6:13pm

You could also use TypeSortedCollections.jl for this (I’m the author). It basically automates your createconcretecollection/sumconcretecollection approach:

julia> using TypeSortedCollections

julia> tsc = TypeSortedCollection(createabstractcollection());

julia> @btime mapreduce(value, +, 0, $tsc)
  1.204 ms (0 allocations: 0 bytes)
500001499788

fabiangans · June 29, 2018, 8:29pm

I really wish there was something like Rust Enums with fields in Julia Enums - Rust By Example together with some nice pattern-matching syntax to do handle these cases efficiently.

It is so far the single language feature that I found in other languages which I am missing now and then when using Julia.

tkoolen · June 29, 2018, 9:37pm

That does seem nice.

Do note that performance of the naive version is much better with Julia 0.7 because of optimizations for small unions of bitstypes:

ABSTRACT COLLECTION
  8.286 ms (0 allocations: 0 bytes)
CONCRETE COLLECTION
  1.038 ms (0 allocations: 0 bytes)

but you’re still giving up quite a bit in this case, probably due to simd.

Edit: TypeSortedCollections on Julia 9965e7cab1b599da00f493fc0481ef7eadd29a52 for good measure (currently broken on latest master due to bug in deprecated.jl:mapreduce · Issue #27871 · JuliaLang/julia · GitHub):

julia> tsc = TypeSortedCollection(createabstractcollection());

julia> @btime mapreduce(value, +, 0, $tsc)
  1.047 ms (0 allocations: 0 bytes)

IssamT · June 30, 2018, 2:40pm

Thank you all for your suggestions. @tkoolen, Indeed TypeSortedCollections.jl seems to be a very good solution if the array isn’t updated often. If I understood correctly, I need to build a new TypeSortedCollection each time it gets updated before I loop on it, which is time consuming.

tkoolen · June 30, 2018, 4:06pm

Well, there are push! and append! methods to modify an existing TypeSortedCollection, but the concrete types that are accepted must be specified by the user or automatically determined from an input collection when the TypeSortedCollection is constructed. I.e., you could also do:

using TypeSortedCollections
tsc = TypeSortedCollection{Tuple{Vector{A}, Vector{B}, Vector{C}}}()
append!(tsc, createabstractcollection())
push!(tsc, A(14, 8.0))

If you want to add a value that has a new type, then you do need to construct a new TypeSortedCollection that also supports that type.

IssamT · June 30, 2018, 4:14pm

Thank you very much. I will now go and read more the details of the package to use it properly.

Topic		Replies	Views
Performance of collections with Functions as abstract type Performance	6	304	August 18, 2022
Performance problems when dealing with deliberately type-unstable code. How to use type knowledge better? Performance	33	1998	December 30, 2020
Iteration/getindex performance of AbstractArray wrapper-types Performance	2	452	January 21, 2021
Is operating a vector that have abstract type really slow? General Usage	11	756	November 29, 2019
Sections of strings or arrays and concrete types Performance	9	403	January 8, 2021

Looping over different types with common behavior

Related topics