Using types to enumerate

I know there have been a few different discussions regarding the use of types themselves as enums. But I was reading through the style guide and found an interesting tidbit that goes against my intuition:

If a type is effectively an enumeration, it should be defined as a single (ideally immutable struct or primitive) type, with the enumeration values being instances of it. Constructors and conversions can check whether values are valid. This design is preferred over making the enumeration an abstract type, with the “values” as subtypes.
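To make the two designs concrete, here is a minimal sketch of each. The names `Color`, `AbstractColor`, and `describe` are made up for illustration and do not come from the style guide.

```julia
# Preferred by the style guide: a single type whose values are instances.
@enum Color red green blue

# Behavior is chosen by comparing values at runtime.
describe(c::Color) = c == red ? "warm" : "cool"

# Discouraged: an abstract type with one subtype per "value".
abstract type AbstractColor end
struct Red   <: AbstractColor end
struct Green <: AbstractColor end
struct Blue  <: AbstractColor end

# Behavior is chosen by method dispatch on the subtype.
describe(::Red) = "warm"
describe(::AbstractColor) = "cool"
```

With `@enum`, the conversion `Color(7)` is rejected with an `ArgumentError`, which is the kind of validity check the guide mentions.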

I would have thought the latter (discouraged) convention would be more ideal, as it could leverage built-in type dispatch (and thus, you can directly dispatch on different enum values with different subtypes)?

How would one cleanly dispatch when the enumeration values are just different instances of a type? More importantly, why is this preferred?

Thanks!

Because you don’t want to stress the inference engine and compiler. Consider this: if you made every distinct value of Int16 its own type, sure, a lot of things could happen at compile time, but your application would then be dominated by compile time …

It also depends a lot on whether the enumeration value is typically known statically (at compile time) or at runtime in your application, and whether there is much payoff to exploiting this.

For example, the iteration interface heavily uses singleton types as a kind of static enum: the function Base.IteratorEltype(iterator) returns one of the singleton instances EltypeUnknown() or HasEltype(). (This is sometimes called the “Holy traits pattern”.) The reason for this is twofold:

  1. The return value (hence type) of IteratorEltype can be determined statically by the compiler, assuming the type of iterator is known.
  2. There is a significant payoff because compiling loops is critical to Julia’s performance, and knowing things about the iterator statically allows the compiler to inline a specialized version of the loop, as well as propagate things like eltype(iterator) information “downstream” to subsequent code.
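As a rough illustration of the pattern, the sketch below dispatches on the trait instance returned by Base.IteratorEltype. The helper `guess_eltype` is a hypothetical name invented here; only `Base.IteratorEltype`, `Base.HasEltype`, and `Base.EltypeUnknown` come from Julia itself.

```julia
# Hypothetical helper: pick an element type based on the iterator's trait.
# The trait call depends only on the type of `itr`, so when that type is
# known the compiler can select the method below at compile time.
guess_eltype(itr) = guess_eltype(Base.IteratorEltype(itr), itr)

# Trait says eltype is meaningful: use it.
guess_eltype(::Base.HasEltype, itr) = eltype(itr)

# Trait says eltype is not known in advance: fall back to Any.
guess_eltype(::Base.EltypeUnknown, itr) = Any

guess_eltype([1, 2, 3])             # a Vector has a known eltype
guess_eltype(x^2 for x in 1:3)      # a Generator does not
```

The trait values act like a two-valued enum, but because each value has its own type, the choice between the two methods costs nothing at runtime when the iterator type is inferred.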

On the other hand, if the enum value is runtime information (e.g. if you can imagine commonly having an array of your enums), putting different values as distinct types creates type instability and forces the compiler to use runtime dispatch, which is bad.
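A small sketch of the array case, with made-up names (`Shape`, `Circle`, `Square`, `area_factor`) chosen only for illustration:

```julia
# Values as instances of one enum type.
@enum Shape circle square
area_factor(s::Shape) = s == circle ? 3.14 : 1.0

# Values as distinct singleton types.
struct Circle end
struct Square end
area_factor(::Circle) = 3.14
area_factor(::Square) = 1.0

enums = [circle, square, circle]         # Vector{Shape}: concrete element type
types = [Circle(), Square(), Circle()]   # Vector{Any}: no common concrete type

# Same result, but for `types` the method must be looked up per element
# at runtime, because the element type tells the compiler nothing.
sum(area_factor, enums)
sum(area_factor, types)
```

Checking `isconcretetype(eltype(enums))` and `eltype(types)` in the REPL is a quick way to see the difference before benchmarking.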

Thanks @stevengj!

putting different values as distinct types creates type instability and forces the compiler to use runtime dispatch, which is bad.

Is this runtime/dynamic dispatch notably slower than if I were to “manually” dispatch during runtime (using a bunch of if-statements checking the value of my enum and calling an appropriate function)?

Both have to evaluate at runtime, but at least with the typed version of the code, I can maintain the advantages of composability offered by multiple dispatch (whereas with my if-statement tree, maintaining/extending the code is often a headache).

Probably it is slower, yes. The basic reason is that Julia’s multiple dispatch rules are much more complicated in general than checking a few integer values, and the compiler cannot generally simplify the former to the latter.
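One possible middle ground, sketched below under the assumption that the set of enum values is small and fixed: branch manually on the runtime value, and let each branch call a method selected through Val. The names `Mode`, `fast`, `safe`, and `process` are hypothetical.

```julia
@enum Mode fast safe

# One method per enum value; these can live in different places in the code.
process(::Val{fast}, x) = 2x
process(::Val{safe}, x) = x < 0 ? zero(x) : 2x

# Manual branch on the runtime value. Inside each arm the Val argument is a
# constant, so the compiler knows which method is being called.
function process(m::Mode, x)
    if m == fast
        return process(Val(fast), x)
    else
        return process(Val(safe), x)
    end
end

process(safe, -3)   # takes the `safe` arm
```

The per-value methods stay separate and extensible, while the only runtime cost is the integer comparison in the branch. The trade-off is that adding a new enum value means touching the branching function as well.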

You can check this for yourself:

julia> using BenchmarkTools

julia> f(x) = x%3==0 ? 0 :
              x%3==1 ? x+1 :
              x÷2
f (generic function with 1 method)

julia> g(x) = g(Val(x%3), x)
       g(::Val{0}, x) = 0
       g(::Val{1}, x) = x+1
       g(::Val, x) = x÷2
g (generic function with 4 methods)

julia> const x=5
       y::Int=5
       z=5
5

julia> @btime f(x) # const-prop
       @btime g(x) # const-prop
       @btime f(y) # type-stable, runtime branching
       @btime g(y) # type-stable, runtime dispatch
       @btime f(z) # type-unstable, runtime branching
       @btime g(z) # type-unstable, runtime dispatch
  1.000 ns (0 allocations: 0 bytes)
  1.000 ns (0 allocations: 0 bytes)
  3.100 ns (0 allocations: 0 bytes)
  54.675 ns (0 allocations: 0 bytes)
  15.230 ns (0 allocations: 0 bytes)
  69.775 ns (0 allocations: 0 bytes)
2

When calling the @btime macro, you can also “interpolate” the value with $, which stabilizes the type but loses constant propagation:

julia> @btime f($x)
       @btime g($x)
       @btime f($y)
       @btime g($y)
       @btime f($z)
       @btime g($z)
  2.900 ns (0 allocations: 0 bytes)
  54.213 ns (0 allocations: 0 bytes)
  3.300 ns (0 allocations: 0 bytes)
  54.213 ns (0 allocations: 0 bytes)
  3.100 ns (0 allocations: 0 bytes)
  54.213 ns (0 allocations: 0 bytes)
2

(This has been the behavior since this PR.)