Style recommendation for enum as type

Mikhail_Kagalenko · September 4, 2020, 9:42pm

Julia documentation states:

If a type is effectively an enumeration, it should be defined as a single (ideally immutable struct or primitive) type, with the enumeration values being instances of it.

Why?
How strong is this recommendation?
Instances can be assigned to any variable, so how do I get @enum-equivalent functionality, where a specific “keyword” corresponds to a fixed “thing”?

Constructors and conversions can check whether values are valid.

An example to clarify what is meant here would be useful.
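
One way to read the two quoted sentences is sketched below. The `Weekday` type, its `value` field and the constant names are hypothetical, chosen only for illustration. The point is that there is a single immutable struct, the enumeration values are `const` instances of it, and the inner constructor rejects invalid input.

```julia
# Hypothetical enum-like type: one immutable struct, values are instances of it.
struct Weekday
    value::UInt8
    # Inner constructor: the only way to build a Weekday, so it can validate.
    function Weekday(v::Integer)
        1 <= v <= 7 || throw(ArgumentError("invalid Weekday value: $v"))
        return new(v)
    end
end

# `const` bindings tie each "keyword" to one fixed instance.
const Monday = Weekday(1)
const Sunday = Weekday(7)

# A conversion that goes through the checking constructor.
Base.convert(::Type{Weekday}, v::Integer) = Weekday(v)

Weekday(1) === Monday        # true: same type, same bits
# Weekday(8)                 # would throw an ArgumentError
```

Because the bindings are `const`, `Monday` always refers to the same instance, even though that instance can also be assigned to any other variable.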

This design is preferred over making the enumeration an abstract type, with the “values” as subtypes.

The latter seems to be a common pattern discussed on Discourse, though. This recommendation seems to conflict particularly with the “Holy traits” pattern.

tomerarnon · September 4, 2020, 10:09pm

If you need each type for dispatch, it’s not an enumeration, and the suggestion doesn’t apply. In this case, enumeration refers precisely to the thing that @enum makes. Note that what it does is what the manual states: it creates a primitive type (a 32-bit integer look-alike), then creates instances of it (MyEnum(0), MyEnum(1)…), and binds those instances to the names you passed to the macro (declaring them const). Although the following is not at all easy to read, it shows this procedure:

julia> using MacroTools

julia> prettify(@macroexpand @enum MyEnum X Y)
:($(Expr(:toplevel, :(begin
      $(Expr(:meta, :doc))
      primitive type MyEnum <: Base.Enums.Enum{Int32} 32 end
  end), :(function MyEnum(rhinoceros::Base.Enums.Integer)
      (0 Base.Enums.:<= rhinoceros Base.Enums.:<= 1) || Base.Enums.enum_argument_error(:MyEnum, rhinoceros)
      return Base.Enums.bitcast(MyEnum, Base.Enums.convert(Int32, rhinoceros))
  end), :((Base.Enums.Enums).namemap(::Base.Enums.Type{MyEnum}) = Dict{Int32,Symbol}(0 => :X,1 => :Y)), :((Base.Enums.Base).typemin(hippopotamus::Base.Enums.Type{MyEnum}) = MyEnum(0)), :((Base.Enums.Base).typemax(snail::Base.Enums.Type{MyEnum}) = MyEnum(1)), :(let koala = (Base.Enums.Any[MyEnum(butterfly) for butterfly = Int32[0, 1]]...,)
      (Base.Enums.Base).instances(::Base.Enums.Type{MyEnum}) = koala
  end), :(const X = MyEnum(0)), :(const Y = MyEnum(1)), :(Base.Enums.nothing))))

Enumerations are most often used in comparisons, e.g. with X and Y from above:

if a == X
...
elseif a == Y
...
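
For a complete version of that comparison pattern, here is a small sketch using a hypothetical `Fruit` enum (the names are illustrative, not from the posts above). It relies only on `@enum`, `instances`, and the integer conversions that `@enum` defines.

```julia
# Hypothetical enum; values are numbered from 0 by default.
@enum Fruit apple banana cherry

# Branch on the value with ordinary comparisons; all Fruit values share one type.
function describe(f::Fruit)
    if f == apple
        return "crunchy"
    elseif f == banana
        return "soft"
    else
        return "small"
    end
end

describe(banana)          # "soft"
Int(cherry)               # 2
Fruit(0) === apple        # true
instances(Fruit)          # (apple, banana, cherry)
```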

Mikhail_Kagalenko · September 4, 2020, 10:21pm

I see. I was a bit confused here: I was thinking of enum instances as types.

MatthijsCox · December 23, 2020, 1:11pm

Hi, I totally recognize this question. I often find myself wondering whether I should write an @enum or just make an abstract type with the “values” as subtypes. When I want to dispatch on the values, I now always write the abstract type myself, and I almost always find a reason to dispatch on the values. I also often create a mapping to an Integer. But it still feels like an enum.

Here’s some example code that I often find myself writing. I would love some advice/discussion on the best pattern for these things. Maybe it’s just fine. I was strongly influenced by this blog post about dispatching on an enum versus a type in Julia.

using Test

module MyModule

    using InteractiveUtils

    export SomeType, First, Second, Third, Fourth

    abstract type SomeType end
    struct First <: SomeType end
    struct Second <: SomeType end
    struct Third <: SomeType end
    struct Fourth <: SomeType end

    Base.Integer(T::Type{<:SomeType}) = Integer(T())
    Base.Integer(::First) = 1
    Base.Integer(::Second) = 2
    Base.Integer(::Third) = 3
    Base.Integer(::Fourth) = 4

    const SomeTypeSet = subtypes(SomeType)

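    function Base.convert(::Type{SomeType}, x::Integer)
        for e in SomeTypeSet
            x == Integer(e) && return e()
        end
        # Fail loudly instead of silently returning `nothing` for an unknown value
        throw(ArgumentError("no SomeType value corresponds to $x"))
    end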
    SomeType(x::Integer) = convert(SomeType, x)

    Base.isless(x1::SomeType, x2::SomeType) = isless(Integer(x1), Integer(x2))
    Base.isless(x1::Type{<:SomeType}, x2::Type{<:SomeType}) = isless(x1(), x2())

end

using .MyModule

@testset "this feels like an enum" begin

    # You want to use either the type or the instance of the type as values
    # but you can also use both this way:
    @test First() < Fourth()
    @test First < Fourth
    @test SomeType(3) == Third()
end

If you use these things as a property of a type, then it’s a trait, right?
get_trait(::Type{MyFirstType}) = First()
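
To make the trait reading concrete, here is a minimal sketch of the Holy traits pattern built around that one-liner. `MyFirstType`, `MySecondType` and the `handle` function are hypothetical stand-ins, and `First`/`Second` are redeclared so that the snippet runs on its own.

```julia
abstract type SomeType end
struct First <: SomeType end
struct Second <: SomeType end

struct MyFirstType end
struct MySecondType end

# The trait is a property of the type, not of any particular instance.
get_trait(::Type{MyFirstType}) = First()
get_trait(::Type{MySecondType}) = Second()

# Entry point: look up the trait from the argument's type, then dispatch on it.
handle(x::T) where {T} = handle(get_trait(T), x)
handle(::First, x) = "handled the First way"
handle(::Second, x) = "handled the Second way"

handle(MyFirstType())     # "handled the First way"
```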

If you use these things as a property of an instance of a type, then it feels like an enum. Like the color of a banana:

abstract type Color end
struct Yellow <: Color end
struct Brown <: Color end

mutable struct Banana
    color::Color
    weight::AbstractFloat
end

eat(b::Banana) = eat(b.color, b)
eat(::Yellow, ::Banana) = println("yummy")
eat(::Brown, ::Banana) = println("eeww")

ericphanson · December 23, 2020, 1:28pm

I think the central question is whether you want the value, i.e. the color of the banana in your example, to be in the type domain or not. What are the implications of being in the type domain?

- it can be used for dispatch, which in particular means it can sometimes be used more composably by downstream packages (i.e. you don’t need the “manual dispatch” of a bunch of if-thens to choose what to do based on the value, and other packages can add methods to participate in dispatch)
- the compiler will try to specialize on the value, compiling separate specializations for f(::Yellow) and f(::Brown)
- if the value is not known at compile time, then calls like f(color::Color) can lead to dynamic dispatch, which is generally slow. There are, however, various compiler heuristics such as small union splitting (i.e. if the compiler knows that the color is either Yellow or Brown, it can replace the dynamic dispatch with a faster if-then branch) that can help speed things up in certain cases.
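
The first point can be illustrated with a short sketch. The `verdict` function and the `Green` color are hypothetical; `Green` stands in for something a downstream package might add without touching the original definitions.

```julia
abstract type Color end
struct Yellow <: Color end
struct Brown <: Color end

# "Upstream" methods, one per color type.
verdict(::Yellow) = "yummy"
verdict(::Brown) = "eeww"

# A hypothetical downstream package adds a color and joins dispatch
# without editing the definitions above.
struct Green <: Color end
verdict(::Green) = "not ripe yet"

# The element type is abstract, so the concrete color of each element
# is only known at run time.
colors = Color[Yellow(), Brown(), Green()]
verdicts = map(verdict, colors)
```

With an @enum, adding a third color would instead mean editing the enum definition and every if-then chain that branches on it.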

I think putting the value in the type domain is something you need to be fairly conscious of and code around, i.e. ensuring the value is known at compile time for hot loops (e.g. via function barriers), or making use of the specialization to speed things up.
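
A minimal sketch of the function-barrier idea, assuming hypothetical `score`, `total_score` and `simulate` functions: the color is picked from run-time data in the outer function and then handed to an inner function, which gets compiled for the concrete color type.

```julia
abstract type Color end
struct Yellow <: Color end
struct Brown <: Color end

score(::Yellow) = 1.0
score(::Brown) = -1.0

# Inner "kernel": compiled separately for each concrete color type,
# so score(c) inside the loop is resolved at compile time.
function total_score(c::Color, n::Integer)
    s = 0.0
    for _ in 1:n
        s += score(c)
    end
    return s
end

# Outer function: the color depends on run-time data, so its concrete
# type is not known here. The call below is the function barrier.
function simulate(name::AbstractString, n::Integer)
    c = name == "yellow" ? Yellow() : Brown()
    return total_score(c, n)
end

simulate("yellow", 3)     # 3.0
```

The cost of not knowing the type is paid once, at the call to `total_score`, rather than on every loop iteration.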

Whereas for a value (not in the type domain), the compiler knows it’s, e.g., a ColorEnum but nothing else (except when constant propagation applies); in particular, it doesn’t compile specialized functions for each of the various colors, and there is no issue with dynamic dispatch.
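
For comparison, here is a sketch of the banana example with the color kept as a value. `ColorEnum` is the name used above; `EnumBanana`, `taste` and the lowercase value names are hypothetical.

```julia
@enum ColorEnum yellow brown

struct EnumBanana
    color::ColorEnum     # concrete field type: every color has the same type
    weight::Float64
end

# One method for all colors; which branch is taken is decided at run time.
taste(b::EnumBanana) = b.color == yellow ? "yummy" : "eeww"

taste(EnumBanana(brown, 118.0))   # "eeww"
```

The trade-off is the "manual dispatch" mentioned earlier: other code cannot add a new color without changing `ColorEnum` and `taste`.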

By the way, I have a blog post about the tradeoffs of putting a value in the type domain (in a different context) that could be helpful.

MatthijsCox · December 23, 2020, 2:45pm

Thanks for the advice! I’ve read the blog post and am trying to apply it to my example.

I think in my example the main issue is the Base.convert(::Type{SomeType}, x::Integer) method, which does not know the values at compile time.

What would be ways around this?

- write a hardcoded if-else statement, like x==1 && return First etc.
- write a macro to generate this statement
- hardcode the SomeTypeSet better somehow?
- any other way?

I guess the downside is that I then lose all extensibility by external packages for this function?
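
One possible shape for the hardcoded options, sketched under the assumption that the set of values is closed: a `const` tuple of instances indexed by the integer, with the `Integer` methods generated in a loop via `@eval` rather than a macro. `SOME_TYPE_INSTANCES` and `from_integer` are hypothetical names.

```julia
abstract type SomeType end
struct First <: SomeType end
struct Second <: SomeType end
struct Third <: SomeType end
struct Fourth <: SomeType end

# Fixed at definition time, so no lookup through subtypes() is needed.
const SOME_TYPE_INSTANCES = (First(), Second(), Third(), Fourth())

# Generates Base.Integer(::First) = 1, Base.Integer(::Second) = 2, and so on.
for (i, T) in enumerate((:First, :Second, :Third, :Fourth))
    @eval Base.Integer(::$T) = $i
end

function from_integer(x::Integer)
    1 <= x <= length(SOME_TYPE_INSTANCES) ||
        throw(ArgumentError("no SomeType value corresponds to $x"))
    return SOME_TYPE_INSTANCES[x]
end

from_integer(3) === Third()   # true
Integer(Second())             # 2
```

The result still depends on the run-time value of `x`, so the return type is not a single concrete type. And since the tuple is fixed, an external package cannot add a fifth value without editing it, which is the extensibility cost raised in the question.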