Why can't Enums be #undef?

When I try to leave an Enum uninitialized, it does not become #undef. Instead, it ends up in an invalid state holding an arbitrary integer.

Consider the following example:

julia> @enum Fruit Apple Banana Peach

julia> struct Basket{T}
           fruit::T
           
           function Basket{T}() where {T}
               return new{T}()
           end
       end

julia> fruit_basket = Basket{Fruit}()
Basket{Fruit}(Main.<invalid #1804537424>)

When initializing an empty struct like this, I would assume its members to be #undef or 0 until I set them. This works totally fine with other datatypes:

julia> float_basket = Basket{Float64}()
Basket{Float64}(0.0)

julia> string_basket = Basket{String}()
Basket{String}(#undef)

Integers are a bit of a special case:

julia> int_basket = Basket{Int64}()
Basket{Int64}(140052098128304)

I guess Enums will be initialized the same way as integers. Therefore, they are assigned some random integer which in most cases will be invalid.

So, my first question would be:
Why aren’t Enums just set to 0, i.e. their first value, by default?

But, there is more to my problem.
With an Enum like above, one has to be careful to not use it before assigning a proper value to it.

However, when trying to use Distributed, the invalid Enum just breaks everything.
We set up the following example to demonstrate the issue.

julia> using Distributed

julia> addprocs(2)
2-element Vector{Int64}:
 2
 3

julia> @everywhere @enum Fruit Apple Banana Peach

julia> @everywhere mutable struct Basket{T}
           fruit::T
           
           function Basket{T}() where {T}
               return new{T}()
           end
       end

julia> for i in 1:5
           Basket{Fruit}()
       end

julia> @distributed vcat for j in 1:5
           Basket{Fruit}()
       end
ERROR: TaskFailedException
Stacktrace:
 [1] wait
   @ ./task.jl:334 [inlined]
 [2] fetch
   @ ./task.jl:349 [inlined]
 [3] preduce(reducer::Function, f::Function, R::UnitRange{Int64})
   @ Distributed /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/macros.jl:274
 [4] top-level scope
   @ /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/macros.jl:286

    nested task error: ArgumentError: invalid value for Enum Fruit: 1273609008
    Stacktrace:
      [1] enum_argument_error(typename::Symbol, x::Int32)
        @ Base.Enums ./Enums.jl:85
      [2] Fruit
        @ ./Enums.jl:198 [inlined]
      [3] read(io::Sockets.TCPSocket, #unused#::Type{Fruit})
        @ Base.Enums ./Enums.jl:22
      [4] deserialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, t::DataType)
        @ Serialization /opt/julia-1.7.2/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:1428
      [5] handle_deserialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, b::Int32)
        @ Serialization /opt/julia-1.7.2/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:865
      [6] deserialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, t::DataType)
        @ Serialization /opt/julia-1.7.2/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:1435
      [7] handle_deserialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, b::Int32)
        @ Serialization /opt/julia-1.7.2/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:870
      [8] deserialize_fillarray!(A::Vector{Basket{Fruit}}, s::Distributed.ClusterSerializer{Sockets.TCPSocket})
        @ Serialization /opt/julia-1.7.2/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:1230
      [9] deserialize_array(s::Distributed.ClusterSerializer{Sockets.TCPSocket})
        @ Serialization /opt/julia-1.7.2/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:1222
     [10] handle_deserialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, b::Int32)
        @ Serialization /opt/julia-1.7.2/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:852
     [11] deserialize
        @ /opt/julia-1.7.2/share/julia/stdlib/v1.7/Serialization/src/Serialization.jl:801 [inlined]
     [12] deserialize_msg(s::Distributed.ClusterSerializer{Sockets.TCPSocket})
        @ Distributed /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/messages.jl:87
     [13] #invokelatest#2
        @ ./essentials.jl:716 [inlined]
     [14] invokelatest
        @ ./essentials.jl:714 [inlined]
     [15] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
        @ Distributed /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/process_messages.jl:169
     [16] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
        @ Distributed /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/process_messages.jl:126
     [17] (::Distributed.var"#99#100"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
        @ Distributed ./task.jl:423
    Stacktrace:
     [1] remotecall_fetch(::Function, ::Distributed.Worker, ::Function, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
       @ Distributed /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/remotecall.jl:469
     [2] remotecall_fetch
       @ /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/remotecall.jl:461 [inlined]
     [3] #remotecall_fetch#158
       @ /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/remotecall.jl:496 [inlined]
     [4] remotecall_fetch
       @ /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/remotecall.jl:496 [inlined]
     [5] (::Distributed.var"#169#170"{typeof(vcat), var"#1#2", UnitRange{Int64}, Vector{UnitRange{Int64}}, Int64, Int64})()
       @ Distributed /opt/julia-1.7.2/share/julia/stdlib/v1.7/Distributed/src/macros.jl:270

julia>

The behavior is the same in Julia versions 1.7.2, 1.5.4 and 1.3.1.

This error makes sense once you know how Enums pick up arbitrary values and how data is transferred between distributed workers. However, in a large project this is incredibly hard to track down.
Therefore, I once more want to ask: why aren’t Enums 0 by default, the same way Float64 values are? Or, even better, why don’t they support #undef?

No, they are not assigned anything. They simply pick up whatever bits happen to be in your RAM at that location, so an uninitialized value is indistinguishable from a real one.

Because an Enum is just bits, Julia displays it as a number instead of showing you #undef.

julia> a = Vector{UInt8}(undef, 2)
2-element Vector{UInt8}:
 0x00
 0xb2

julia> a = Vector{Ref{Int}}(undef, 2)
2-element Vector{Ref{Int64}}:
 #undef
 #undef
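
To make the distinction concrete, here is a small sketch. The Basket type is copied from the question; everything else is standard Base. It checks which types are plain bits and whether Julia can tell that a slot was never set:

```julia
@enum Fruit Apple Banana Peach

struct Basket{T}
    fruit::T
    Basket{T}() where {T} = new{T}()   # leaves `fruit` uninitialized
end

# Enums and Float64 are plain-bits types; String is stored as a reference.
enum_is_bits   = isbitstype(Fruit)     # true
float_is_bits  = isbitstype(Float64)   # true
string_is_bits = isbitstype(String)    # false

# Only reference fields can be detected as undefined.
string_field_defined = isdefined(Basket{String}(), :fruit)   # false
enum_field_defined   = isdefined(Basket{Fruit}(), :fruit)    # true, even though the bits are garbage

# The same holds for array elements.
string_slot_assigned = isassigned(Vector{String}(undef, 2), 1)   # false
enum_slot_assigned   = isassigned(Vector{Fruit}(undef, 2), 1)    # true
```

So for bits types there is simply no "unset" marker that Julia could show as #undef.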

The 0.0 in your Float64 example is just a fluke, I believe.

Also, @enum allows you to start from a value other than zero, so zero is not a safe default for every enum.

julia> @enum MyEnum A=1 B C

julia> A
A::MyEnum = 1

julia> B
B::MyEnum = 2
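
A quick sketch of why zero cannot serve as a universal default, using the two enums from this thread (the variable `zero_is_valid` is only for illustration):

```julia
@enum Fruit Apple Banana Peach   # values 0, 1, 2
@enum MyEnum A=1 B C             # values 1, 2, 3

zero_is_apple = Fruit(0) === Apple   # true: 0 is a valid Fruit

# 0 is not a valid MyEnum, so the checked constructor throws an ArgumentError.
zero_is_valid = try
    MyEnum(0)
    true
catch err
    err isa ArgumentError ? false : rethrow()
end
```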

You probably should not be leaving them undefined. As @jling pointed out, your Float64 example has no guarantee that the value will be initialized with a zero. On my machine:

julia> v = Vector{Float64}(undef, 1000);

julia> maximum(v)
NaN

julia> v = Vector{Float64}(undef, 100);

julia> maximum(v)
0.0

So you may have even more problems.

I recommend either changing the field to be a union of the enum type and Nothing and initializing it with nothing, or creating a new enum value called UNINITIALIZED or something like that and initializing the struct with it.
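
Both options could look roughly like this. It is a minimal sketch: the names OptionalBasket, SentinelBasket, SafeFruit and its members are made up for illustration and are not from the original code.

```julia
@enum Fruit Apple Banana Peach

# Option 1: allow `nothing` in the field and start with it.
mutable struct OptionalBasket{T}
    fruit::Union{T, Nothing}
    OptionalBasket{T}() where {T} = new{T}(nothing)
end

# Option 2: a dedicated sentinel member (hypothetical enum with its own names).
@enum SafeFruit Uninitialized Pear Plum

mutable struct SentinelBasket
    fruit::SafeFruit
    SentinelBasket() = new(Uninitialized)
end

b1 = OptionalBasket{Fruit}()
b1_starts_empty = b1.fruit === nothing        # true until a fruit is assigned
b1.fruit = Banana

b2 = SentinelBasket()
b2_starts_empty = b2.fruit == Uninitialized   # true until a fruit is assigned
b2.fruit = Plum
```

With the union, forgetting to set the field shows up as `nothing` instead of garbage bits. With the sentinel, the field stays a plain enum, but every consumer has to handle the extra member.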

Interesting, I tried this a couple of times, and got zero every time. Got lucky, I guess.

That’s a very good point. Wouldn’t make sense then to give Enums default values.

We’re changing the way we use/initialize our parametric types now, so everything will always have a value that makes sense.

Still, I’m wondering if the error message that Distributed / Serialization throws when trying to pass through an invalid Enum could be changed to make this issue quicker to debug. I don’t have a good idea where and what I would change, but this seems to be a tricky pitfall when working with serialized messages.

It’s odd to me that in the first example, it was happy to pass along a Basket{Fruit}(Main.<invalid #1804537424>) while in the Distributed example, it throws an error. FWIW, this error does not get thrown for me when I’m using Julia v1.7.2 macOS ARM, it just gives me a vector of enums with invalid initializations. I only hit the error on my other machine using Julia v1.7.2 macOS x86.

So I guess to me the important question is “What is the expected behavior for initializing Enums with invalid integers?” I’d be fine with either the error or the Main.<invalid#...>, but it seems weird that it could go either way.

The error that gets thrown when using Distributed is thrown by deserialize. So it’s not actually an issue of Enums being initialized with invalid integers, but of Serialization not accepting invalid integers when deserializing data. Probably because it does something like this:

julia> @enum Fruit Apple Banana Peach

julia> Fruit(1234)
ERROR: ArgumentError: invalid value for Enum Fruit: 1234
Stacktrace:
 [1] enum_argument_error(typename::Symbol, x::Int64)
   @ Base.Enums ./Enums.jl:85
 [2] Fruit(x::Int64)
   @ Main ./Enums.jl:198
 [3] top-level scope
   @ REPL[4]:1
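
The stack trace in the question goes through `read(io, Fruit)` in Enums.jl, so the same failure can be reproduced without Distributed by reading raw integer bits from a stream. The sketch below also includes a small validity check that could be called before sending data to a worker. `is_valid_enum` is a hypothetical helper, not part of Base, and `reinterpret` is used here only to build an invalid value on purpose.

```julia
@enum Fruit Apple Banana Peach

# Build an invalid Fruit on purpose by reinterpreting raw bits
# (@enum uses Int32 as the default base type).
bad = reinterpret(Fruit, Int32(1234))

# Hypothetical helper: true only if `x` is one of the declared members.
is_valid_enum(x::Enum) = x in instances(typeof(x))

banana_ok = is_valid_enum(Banana)   # true
bad_ok    = is_valid_enum(bad)      # false

# Reading from a stream goes through the checked constructor,
# which is where the ArgumentError in the stack trace comes from.
io = IOBuffer()
write(io, Int32(1234))
seekstart(io)
read_failed = try
    read(io, Fruit)
    false
catch err
    err isa ArgumentError
end
```

A check like `is_valid_enum` in the constructor or just before a remote call would surface the problem on the sending side, where it is much easier to locate.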

You could use LazilyInitializedFields.jl for your uninitialized fields.
