Subtypes or parametric types to define constructor-dependent functions

I have a structure of data that can be returned by two (or more) different functions, such as:

struct MyType
    x::Int
end

f1(x) = MyType(x)
f2(x, y) = MyType(x - y)

Now, I want another function g to take such objects but do different things with them, depending on how they were created, while other functions like h don’t care about that, e.g.:

g(z) = z.x      # (if `z` were created as `z = f1(x)`)
g(z) = abs(z.x) # (if `z` were created as `z = f2(x, y)`)
h(z) = -z.x     # no matter how `z` was created.

An obvious solution is to use different types and multiple dispatch, as:

abstract type MyType end

struct MyType1 <: MyType
    x::Int
end

struct MyType2 <: MyType
    x::Int
end

f1(x) = MyType1(x)
f2(x, y) = MyType2(x - y)

g(z::MyType1) = z.x
g(z::MyType2) = abs(z.x)
h(z::T) where T <: MyType = -z.x

However, this is not very convenient if the data structures are complex (e.g. they have many fields), and I want to define many variants of them.

If the underlying structure is always the same, and the only thing that changes is how the constructors and other functions operate on the objects, I can think of two ways of making the code shorter.

Option 1: define a macro to create the subtypes.

abstract type MyType end

macro MyType(TypeName)
    # esc(...) makes the struct get defined in the caller's scope under the given name
    esc(quote
        struct $TypeName <: MyType
            x::Int
        end
    end)
end

# And now make as many subtypes as wanted, with only one line each:

@MyType MyType1
@MyType MyType2

# The rest is the same...

Option 2: use parametric types

struct MyType{P}
    x::Int
end

f1(x) = MyType{1}(x)
f2(x, y) = MyType{2}(x - y)

g(z::MyType{1}) = z.x
g(z::MyType{2}) = abs(z.x)
h(z::T) where T <: MyType = -z.x

(The fact that I have defined the parameter P to be integers is irrelevant.)
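To illustrate that remark, here is a minimal sketch of Option 2 with symbols instead of integers as the type parameter. The struct name `Tagged` and the tags `:plain` and `:diff` are made up for this example and are not part of the original code.

```julia
# Hypothetical stand-in for the parametric `MyType{P}` above.
# The parameter P carries no data; it only tags how the object was built.
struct Tagged{P}
    x::Int
end

f1(x) = Tagged{:plain}(x)
f2(x, y) = Tagged{:diff}(x - y)

# Dispatch on the tag where the origin matters...
g(z::Tagged{:plain}) = z.x
g(z::Tagged{:diff}) = abs(z.x)

# ...and ignore it where it does not.
h(z::Tagged) = -z.x

g(f1(-3))     # -3
g(f2(1, 4))   # 3
h(f2(1, 4))   # 3
```

Since the tag lives only in the type, the object itself still stores just the `Int` field.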

Now, my question: is there any particular advantage or disadvantage to either approach? Which one would you recommend, or is there a better solution for this?

Is adding a new field to the structure, and using it to record which constructor was called (i.e., each constructor sets this field to a distinct value), off the table? The approaches you posted make the compiler generate a separate copy of each method for each type, which may be what you want (an optimized method for each version of the type) but can also add some overhead. If you want finer-grained control over this, you can use a simple external parametric type, like Val, as an additional argument to certain methods to control which version of the method is called.

Do you mean something like (following my previous example):

struct MyType{T}
    dummy::T
    x::Int 
end

f1(x) = MyType(Val(1), x)
f2(x, y) = MyType(Val(2), x - y)

g(z::MyType{Val{1}}) = z.x
g(z::MyType{Val{2}}) = abs(z.x)

If that is what you mean, I don’t see a big difference from option 2 in the original post. And isn’t that additional field also a (small) overhead?

Hi,
If you have a type that needs to be treated differently depending on how it was created, it usually means that you actually have two types. In the long run it is better to accept this as a fact, create two types, and move on.

I think the suggestion was to use something like Holy traits, but stored as a separate field, and to dispatch on it only when needed:


struct Trait1 end
struct Trait2 end

struct MyType
    trait
    x::Int 
end

f1(x) = MyType(Trait1(), x)
f2(x, y) = MyType(Trait2(), x - y)

g(z::MyType) = g(z.trait, z.x)
g(::Trait1, x) = x
g(::Trait2, x) = abs(x) 

This should be (almost) zero cost afaik (I’m a newbie so I’m just parroting what others have said though) despite looking like some overhead.

A more traditional way to avoid duplication in this case is to use wrapper types:

struct MyBaseType
    x::Int
end

struct MyType1
    b::MyBaseType
end

struct MyType2
    b::MyBaseType
end

f(x) = MyType1(MyBaseType(x))
f(x,y) = MyType2(MyBaseType(x-y))

g(z::MyType1) = z.b.x
g(z::MyType2) = abs(z.b.x)

Not sure if this approach has any performance downsides over the former though.

Thanks for the clarification. I still have a doubt about the addition of the trait. In this definition:

struct MyType
    trait
    x::Int 
end

Does it not contradict the premise of avoiding fields with abstract types? That could be avoided if MyType is defined as a parametric type, but then, is this not virtually the same as the second option I described?
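A small sketch of the difference being asked about, under the assumption that the trait is stored as a field. The names `Untyped` and `Typed` are hypothetical and only used to put the two definitions side by side.

```julia
struct Trait1 end
struct Trait2 end

# Hypothetical name: same layout as the trait-field struct above.
# An unannotated field is implicitly declared as `::Any`.
struct Untyped
    trait
    x::Int
end

# Hypothetical name: the trait's type becomes a type parameter,
# so every field has a concrete type once T is fixed.
struct Typed{T}
    trait::T
    x::Int
end

g(z::Typed) = g(z.trait, z.x)
g(::Trait1, x) = x
g(::Trait2, x) = abs(x)

isconcretetype(fieldtype(Untyped, :trait))         # false
isconcretetype(fieldtype(Typed{Trait1}, :trait))   # true
g(Typed(Trait2(), -5))                             # 5
```

With the parametric version, the information again ends up in the type (`Typed{Trait1}` vs. `Typed{Trait2}`), which is why it looks so close to the second option of the original post.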

On the other hand, the last solution suggested looks rather like the option of creating two different types (and maybe an abstract type that unites them for other “common” functions), although introducing type composition. Does such type composition introduce any particular advantage?
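As a rough sketch of what composition buys in the many-fields case: the shared fields are declared once, and each variant is a one-field wrapper. All names here (`Payload`, `AbstractWrapper`, `Wrapped1`, `Wrapped2`, and the extra `label` field) are hypothetical and only serve the illustration.

```julia
# Shared data, declared once. Adding a field here requires
# no change to the wrapper definitions below.
struct Payload
    x::Int
    label::String
end

# Abstract supertype for functions that treat all variants alike.
abstract type AbstractWrapper end

struct Wrapped1 <: AbstractWrapper
    b::Payload
end

struct Wrapped2 <: AbstractWrapper
    b::Payload
end

f1(x) = Wrapped1(Payload(x, "direct"))
f2(x, y) = Wrapped2(Payload(x - y, "difference"))

g(z::Wrapped1) = z.b.x
g(z::Wrapped2) = abs(z.b.x)
h(z::AbstractWrapper) = -z.b.x   # relies on every wrapper having a field `b`

g(f2(1, 4))   # 3
h(f1(2))      # -2
```

The cost is the extra `.b` indirection in the source code whenever a field is accessed.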

What I wanted to suggest is even simpler.

Just create a Symbol, Enum, UInt8, or Int field in the struct (the type you prefer). There is no need for it to be abstract.

For each distinct constructor, the constructor initializes the field with a different value.

For a method that does not care how the struct was constructed, simply do not look at the field.

For a method that does care how the struct was constructed, but whose variants differ only slightly (so there is little to gain from compiling a different method for each one), use ‘if’ statements inside the method that check the value of the field.

For a method that does care how the struct was constructed, and that benefits from a different implementation/compilation for each variant, create an outer method that takes the struct, wraps the value of the new field in Val(), and calls an inner function that takes both the struct and the Val and has a different body for each Val value.
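The three cases above could look roughly like this. It is only a sketch: the struct name `Flagged`, the field name `origin`, the tags `:f1` and `:f2`, and the function name `g_if` are assumptions made for the example.

```julia
# Hypothetical struct: `origin` records which constructor built the object.
struct Flagged
    origin::Symbol   # :f1 or :f2
    x::Int
end

f1(x) = Flagged(:f1, x)
f2(x, y) = Flagged(:f2, x - y)

# Case 1: the origin does not matter, so the field is ignored.
h(z::Flagged) = -z.x

# Case 2: small difference between variants, handled with a branch.
g_if(z::Flagged) = z.origin === :f1 ? z.x : abs(z.x)

# Case 3: outer method wraps the field value in Val and
# forwards to an inner method with one body per value.
g(z::Flagged) = g(z, Val(z.origin))
g(z::Flagged, ::Val{:f1}) = z.x
g(z::Flagged, ::Val{:f2}) = abs(z.x)

g(f1(-3))       # -3
g(f2(1, 4))     # 3
g_if(f2(1, 4))  # 3
h(f1(2))        # -2
```

Note that `z.origin` is an ordinary run-time value here, so the `Val(z.origin)` call has to pick the inner method at run time; the struct itself stays a single concrete type.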

Yes, I guess you are right not to trust the newbie here :)

This thread touches upon a similar topic (in case you haven’t seen it already):