Dispatch on parametric type

I’m writing an application where I’ve found it convenient to dispatch on the parameters of some structs. Something like the following:

abstract type MyType end
abstract type A <: MyType end
abstract type B <: MyType end

struct MyStruct1{MT<:MyType,T}
    val1 :: T
    val2 :: T
    # ...
end
MyStruct1{MT}(x,y) where {MT<:MyType} = MyStruct1{MT,typeof(x)}(x,y)

struct MyStruct2{MT<:MyType,T}
   ab :: T
   cd :: T
    # ...
end
MyStruct2{MT}(x,y) where {MT<:MyType} = MyStruct2{MT,typeof(x)}(x,y)

func1(foo::MyStruct1{A}) = foo.val1 + foo.val2
func1(foo::MyStruct1{B}) = foo.val1 * foo.val2

func2(foo::MyStruct2{A}) = 42
func2(foo::MyStruct2{B}) = 666

This allows me to dispatch on the parameter MyType as follows:

julia> fooA = MyStruct1{A}(1,2)
julia> fooB = MyStruct1{B}(1,2)
julia> fooC = MyStruct2{A}(1,2)
julia> fooD = MyStruct2{B}(1,2)

julia> func1(fooA)
3

julia> func1(fooB)
2

julia> func2(fooC)
42

julia> func2(fooD)
666

This seems to be working fine for me, and I’ve noticed that there have been similar questions, like here: Dispatch function based on value of parametric type

However, I’ve noticed the section on the manual warning on the dangers of abusing multiple dispatch and I’m now wondering if this applies to my construction above… My actual use case has much more complicated functions and each struct has a lot more elements. Is there anything I should be concerned with in the construction above?

1 Like

The manual advises caution when dispatching on values-as-parameters (see here), which is not what you are doing. What you are doing is dispatching on a parameter that is a type (rather than an isbits value), which is perfectly normal. However, you could probably change this

MyStruct1{MT}(x,y) where {MT<:MyType} = MyStruct1{MT,typeof(x)}(x,y)

to this:

MyStruct1{MT}(x::T, y::T) where {MT<:MyType,T} = MyStruct1{MT,T}(x, y)

Also, does your # ... section contain a struct field that looks like val3::MT? One thing you do want to avoid is declaring fields as abstract types. (The only subtypes of MyType shown in your example are abstract types.)

EDIT:
Ok, it looks like you’re using the abstract types A and B just for dispatching, not as the types of any of the fields in the struct. I think people do that, and I guess it’s ok. I’m not an expert on the pros and cons of that approach…

2 Likes

Indeed, abstract types A and B are not types of any of the fields, I just use them for dispatching purposes. I haven’t found many examples of people using this approach, so that’s also why I was wondering if this is ok or if there are potential problems…

I don’t think there are any technical issues with doing that, and as far as I know there are no performance issues. However, I think it’s a technique that is best used in moderation. When in doubt, keep it simple. I would lean toward making two separate concrete types, say MyStructA and MyStructB. Or depending on the scenario, maybe func1(foo::MyStruct1{A}) and func1(foo::MyStruct1{B}) should be two separate functions like asdf() and qwer(). If their functionality is different enough, then they should probably have different names.

A third possible approach would be something like this:

func(::Type{A}, foo::MyStruct) = # ...
func(::Type{B}, foo::MyStruct) = # ...

Then you could call func(A, x) or func(B, x) to get different behaviors.

Wouldn’t this be somewhat equivalent to my original approach? The reason I’d keep it is that for my use case I need to loop over arrays with MyStruct as element and act on them with func. So I can then do something like

fooA  = MyStruct1{A}(1,2)
fooB  = MyStruct1{B}(1,2)
array = [fooA, fooB]
sol   = [func1(array[i]) for i in eachindex(array)]

I realize I could also split MyStruct into two different types, but for my actual use case this struct has a lot of fields, so the original approach saves me a lot of repeated code…

[EDIT: I had a bug in my code, so I’ve updated the results and discussion.]

Ok, I did some benchmarking of three different approaches to test the speed of dispatch on an array of structs:

# --- Switch by type parameter. ---

abstract type MyType end
abstract type A <: MyType end
abstract type B <: MyType end

struct MyStruct{MT<:MyType,T}
    x::T
    y::T
end

MyStruct{MT}(x::T, y::T) where {MT<:MyType,T} = MyStruct{MT,T}(x, y)
MyStruct(::Type{MT}, x::T, y::T) where {MT<:MyType,T} = MyStruct{MT,T}(x, y)

func(s::MyStruct{A}) = s.x + s.y
func(s::MyStruct{B}) = s.x * s.y

function randstructMT(n)
    switch = rand([A, B], n)
    x, y = rand(n), rand(n)
    MyStruct.(switch, x, y)
end


# --- Switch by concrete type. ---

struct MyStructA{T}
    x::T
    y::T
end

struct MyStructB{T}
    x::T
    y::T
end

func(s::MyStructA) = s.x + s.y
func(s::MyStructB) = s.x * s.y

function randstructAB(n)
    switch = rand(Bool, n)
    x, y = rand(n), rand(n)
    [switch ? MyStructB(x, y) : MyStructA(x, y) for (switch, x, y) in zip(switch, x, y)]
end


# --- Switch by field value. ---

struct MyStructSwitch{T}
    x::T
    y::T
    switch::Bool
end

function func(s::MyStructSwitch)
    if s.switch
        s.x * s.y
    else
        s.x + s.y
    end
end

function randstructswitch(n)
    switch = rand(Bool, n)
    x, y = rand(n), rand(n)
    MyStructSwitch.(x, y, switch)
end


# --- Benchmarks. ---

using BenchmarkTools

a_MT = randstructMT(1000)
a_AB = randstructAB(1000)
a_Switch = randstructswitch(1000)

# warmup
func(a_MT[1]);
func(a_AB[1]);
func(a_Switch[1]);

@btime func.($a_MT);
@btime func.($a_AB);
@btime func.($a_Switch);

Results:

julia> @btime func.($a_MT);
  1.625 μs (1 allocation: 7.94 KiB)

julia> @btime func.($a_AB);
  45.009 μs (1494 allocations: 31.30 KiB)

julia> @btime func.($a_Switch);
  3.211 μs (1 allocation: 7.94 KiB)

Your approach performs the fastest! I can’t say my knowledge of Julia internals is very strong, so I can’t really explain the performance differences between the various approaches. In particular, I’m surprised that the MyStructA / MyStructB approach is so much slower, and I don’t understand why it has so many more allocations.

I ran the benchmarks again, but this time I changed the function names for the three different implementations so that the different implementations are not sharing the same method table for func. Here are the results:

julia> @btime func1.($a_MT);
  5.062 μs (1 allocation: 7.94 KiB)

julia> @btime func2.($a_AB);
  44.826 μs (1494 allocations: 31.30 KiB)

julia> @btime func3.($a_Switch);
  3.575 μs (1 allocation: 7.94 KiB)

In this case the approach with the if-else switch appears to be slightly faster, though still comparable to your approach.

We recently had a similar discussion here:

The TL;DR is that it is better to use explicit fields with singleton instances of structs instead of abstract types for dispatch because it composes better, is extendable, and easier to understand, e.g.

abstract type MyType end
struct AType <: MyType end
const A = AType()
struct BType <: MyType end
const B = BType()

struct MyStruct1{MT<:MyType,T}
    field::MT
    val1::T
    val2::T
    # ...
end
3 Likes

Thank you all for the feedback and careful analysis. Indeed, using singleton types seems like a very nice alternative.

1 Like

This made me curious. What is the advantage of explicitly declaring a singleton field in a struct?

I was thinking of singleton type dispatch without an explicit field and with a function-based API:

abstract type MyType end
struct AType <: MyType end
...

struct MyStruct1{MT<:MyType,T}
    val1::T
    val2::T
    # ...
    MyStruct1(::MT, args::T...) where {MT, T} = MyStruct1{MT, T}(args...)
end

mtype(::MyStruct1{MT,T}) = MT()

Is that synonymous to declaring a field of type MT in the structure or am I missing some gotcha?
(looking at the implementation of priority queues in DataStructures.jl, the explicit declaration of an ::Ordering field is the choice there as well)

Working with values instead of types is a bit more convenient syntactically and has no performance cost. Also, it makes it very easy to add actual fields to some of those types if the need arises.

Generally, when there is a choice between operating in value vs type space in Julia, it is usually advisable to go with values.

1 Like

Ah, the extensibility of a field to non-singleton types if needed is what I’ve overlooked.