Adding Type to (working) Union{...} bricks it

I’ve got a type defined as Union{Dict{Union{UInt8, String}, Number}, Nothing}. This works, but as soon as I add String to this Union it breaks and does not accept the initially working values anymore. See MWE below.

The error is:

ERROR: MethodError: Cannot `convert` an object of type 
  Dict{UInt8, Float64} to an object of type        
  Union{Dict{Union{UInt8, String}, Number}, String}

(respectively [...] Dict{String, Float64} to an object of type [...] for the Dict using String keys in the MWE)

Why does this work the way it does? And what’s the best solution to achieve that?


Base.@kwdef struct MyStruct1
    val::Union{Dict{Union{UInt8, String}, Number}, Nothing} = nothing
end

Base.@kwdef struct MyStruct2
    val::Union{Dict{Union{UInt8, String}, Number}, String, Nothing} = nothing
end

x = Dict(0x1 => 1.0, 0x2 => 1.0)
y = Dict("1" => 1.0, "2" => 1.0)

MyStruct1()     # MyStruct1(nothing)
MyStruct1(x)    # MyStruct1(Dict{Union{UInt8, String}, Number}(0x02 => 1.0, 0x01 => 1.0))
MyStruct1(y)    # MyStruct1(Dict{Union{UInt8, String}, Number}("1" => 1.0, "2" => 1.0))

MyStruct2()     # MyStruct2(nothing)
MyStruct2(x)    # this fails
MyStruct2(y)    # this fails

Remark: This stays the same if changing Number to Float64 explicitly.

Edit: It works when falling back to Union{Dict, String, Nothing}, but I actually need the correct type, with the Dict values being Number, for some internal conversion to work correctly (so it’s not only about fully specifying the type, it actually breaks functionality).

So I found out more, and kind of think I understand why this fails:

Dict{String, Number} <: Union{Dict, String}     # true
Dict{String, Number} <: Dict{String, Number}    # true
String <: Any                                   # true
Dict{String, Number} <: Dict{Any, Number}       # false

# therefore this fails too
Dict{String, Number} <: Dict{Union{Number, String}, Number}

While I could understand why this is probably intended behaviour (?), I’m still unsure how to then do what I am trying to do!?


Edit: It still feels counterintuitive that this happens:

Dict{String, Number} <: Dict                    # true
Dict{String, Number} <: Dict{Any, Any}          # false

Edit #2: Using this, one can construct a (partial) workaround… But that feels kind of stupid:

Base.@kwdef struct MyStruct3
    val::Union{Dict, String, Nothing} = nothing
end
# Convert Dict inputs to the intended key/value types; pass everything else through.
MyStruct3(val) = isa(val, Dict) ? MyStruct3(convert(Dict{Union{UInt8, String}, Number}, val)) : MyStruct3(val)

MyStruct3(x)    # MyStruct3(Dict{UInt8, Float64}(0x02 => 1.0, 0x01 => 1.0)) 
MyStruct3(y)    # MyStruct3(Dict("1" => 1.0, "2" => 1.0))
Dict{String, Number} <: Dict{Any, Any}          # false

is intended and expected. The relevant passage from the manual is, I believe,

https://docs.julialang.org/en/v1/manual/types/#man-parametric-composite-types

In other words, in the parlance of type theory, Julia’s type parameters are invariant, rather than being covariant (or even contravariant). This is for practical reasons: while any instance of Point{Float64} may conceptually be like an instance of Point{Real} as well, the two types have different representations in memory:

Explicitly allow for subtypes and it works:

julia> Base.@kwdef struct MyStruct3
           val::Union{Dict{<:Union{UInt8, String}, <:Number}, String, Nothing} = nothing
       end
MyStruct3

julia> MyStruct3(x)
MyStruct3(Dict{UInt8, Float64}(0x02 => 1.0, 0x01 => 1.0))

julia> MyStruct3(y)
MyStruct3(Dict("1" => 1.0, "2" => 1.0))

julia> MyStruct3(nothing)
MyStruct3(nothing)

julia> MyStruct3()
MyStruct3(nothing)
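
The invariance rule quoted from the manual can also be checked directly on the types. A minimal sketch, using only types that already appear in this thread:

```julia
# The element types are related ...
Float64 <: Number                                               # true

# ... but type parameters are invariant, so the Dict types are not.
Dict{String, Float64} <: Dict{String, Number}                   # false

# Writing `<:` inside the braces turns the parameter into an upper bound.
Dict{String, Float64} <: Dict{String, <:Number}                 # true
Dict{UInt8, Float64} <: Dict{<:Union{UInt8, String}, <:Number}  # true
```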

Tangent:
Are you sure you need a Union{T,Nothing} for some container type T? Nothing can be useful to Union with scalar values, but for containers such as Dict, String, Array, etc one can often avoid Nothing and simply insert an empty container. The only reason this wouldn’t work is if an empty container represented a different semantic than what you’re trying to represent with Nothing.


Thank you @skleinbo for explaining that (and pointing out the passage in the manual, that I did not find), now it makes more sense! However, I still wonder why

Dict <: Dict{Any, Any}    # false

since I am unsure what kind of type Dict actually is, if not a short form of Dict{Any, Any}; is there some “more any”-type than Any?


What I now do not understand however, is why MyStruct1 works, but MyStruct2 does not? Shouldn’t MyStruct1 then fail with the same reasoning?

And I do have an additional question, stemming from the fact that my MWE did in fact (stupid me, sorry!) not capture everything… :frowning:

My data comes from parsing an input/configuration file (using YAML). Since I cannot specify that further, I am reading it using YAML.load_file(filename, dicttype=Dict{String, Any}). Now your proposed solution solves my MWE, but breaks the real code in another way: Dict{String, Any} obviously does not work with the subtyping of Dict{<:Union{UInt8, String}, <:Number}, since (Any <: Number) === false.

What I was previously doing was relying on a working conversion from Any to Number, but that can no longer be triggered due to the (still occurring) Cannot `convert` an object of type error.

Why Any / Number? Because the overall file is mixed, with some values being Strings, some being Numbers, some being Dicts. Therefore I need to (globally) read the file with Any, but I can guarantee that the parts of the parsed dictionary that I am using to init a specific struct do actually have the correct type. (Yes, that means I could potentially do the conversion to a correct subtype manually as soon as I know the exact type, but that’s not a valid way due to other implications.)
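
One possible way to keep the concrete field type and still accept a Dict{String, Any} from the parser is to do the conversion explicitly inside a constructor rather than relying on the implicit convert. This is only a sketch under assumptions: ValDict, normalize_val and MyStruct4 are hypothetical names that do not appear in the real code, and it uses a positional default instead of Base.@kwdef.

```julia
# Hypothetical alias for the intended concrete Dict type.
const ValDict = Dict{Union{UInt8, String}, Number}

struct MyStruct4
    val::Union{ValDict, String, Nothing}
    # An inner constructor replaces the default ones, so every input
    # passes through `normalize_val` before it is stored.
    MyStruct4(val = nothing) = new(normalize_val(val))
end

# Hypothetical helper: any Dict is converted element by element,
# Strings and nothing are passed through unchanged.
normalize_val(val::AbstractDict) = convert(ValDict, val)
normalize_val(val::Union{String, Nothing}) = val

parsed = Dict{String, Any}("1" => 1.0, "2" => 2)   # stands in for the YAML output
s = MyStruct4(parsed)
typeof(s.val) == ValDict                           # true
```

The conversion throws an error if a key or value does not actually fit the target types, so the guarantee about the parsed data is checked at construction time.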


@mikmoore: Yea that’s a good question. Unfortunately I actually need the Union{T, Nothing} in this case, due to two reasons:

  1. nothing, in this case, indicates the user not specifying the respective property at all in a config file, while Dict() would indicate the user specifying it but not further configuring it (there are some cases where that is to be allowed)
  2. But more importantly, I would assume (see the code example below why this is just an assumption…), that the Dict (even when empty) uses more memory. With a few millions of these objects, with each struct containing multiple “could-be-unused” fields, that can be of relevance.

Code for checking allocations (I am unsure why the results differ the way they do…):

Base.@kwdef struct MyStruct5
    val::Union{Dict, Nothing} = nothing
end

Base.@kwdef struct MyStruct6
    val::Dict = Dict()
end

function check_alloc()
    println(Base.summarysize(MyStruct5()))      # 8
    println(@allocated MyStruct5())             # 0
    println(varinfo(r"MyStruct5()"))            # 240 bytes

    println(Base.summarysize(MyStruct6()))      # 464
    println(@allocated MyStruct6())             # 512
    println(varinfo(r"MyStruct6()"))            # 224 bytes
end

check_alloc()

Dict{Any,Any} <: Dict, not the other way around.
Dict == Dict{<:Any,<:Any}, however.

I’ll remark that by the time you have a broad abstract type like Number in a parameter, there’s usually no penalty to just using Any.


As @mikmoore pointed out, this means you have not really understood the concept yet. Dict{Any, Any} is a single concrete type with a fixed memory layout, in which each key and each value needs to be a pointer plus a tag indicating the object type. Dict{<:Any, <:Any} (or just Dict for short) is a UnionAll type, which can be seen not as a single type but as a collection of many distinct types. You cannot instantiate an object of type Dict{<:Any, <:Any}, just as you cannot instantiate an object of type Number: both are abstract categories or placeholders that other types may or may not match, and they have no implementation or memory layout of their own. For example, Dict{Int, Int} is a more specific type with a fixed memory layout (one that does not need pointers and type tags), and it is a subtype of Dict{<:Any, <:Any}. It can therefore be used anywhere (function parameter or struct field) where the definition restricts the type to Dict{<:Any, <:Any} or Dict{<:Number, <:Number}.
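
The distinction can be inspected in the REPL. A short sketch using only Base functions:

```julia
# Dict without parameters is a UnionAll, i.e. a whole family of types.
Dict isa UnionAll                   # true
Dict == Dict{<:Any, <:Any}          # true
isconcretetype(Dict)                # false

# Dict{Any, Any} is one concrete member of that family.
isconcretetype(Dict{Any, Any})      # true
Dict{Any, Any} <: Dict              # true
Dict <: Dict{Any, Any}              # false

# Dict{Int, Int} is another concrete member, unrelated to Dict{Any, Any}.
isconcretetype(Dict{Int, Int})      # true
Dict{Int, Int} <: Dict              # true
Dict{Int, Int} <: Dict{Any, Any}    # false
```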


Julia treats the possible absence of a value (nothing) as a special case. Calling convert on the abstract type Union{Nothing, T} basically strips the Nothing. From the source:

For MyStruct1 we have T = Dict{Union{UInt8, String}, Number}, for which a conversion is possible. In the case of MyStruct2, T = Union{Dict{Union{UInt8, String}, Number}, String}, for which no conversion exists, in particular because it is an abstract type.

So in short: Union{Nothing, ConcreteType} is handled conveniently.
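
The difference can be reproduced with convert directly, without any struct. A sketch in which D is just a shorthand introduced here for the Dict type from the MWE:

```julia
D = Dict{Union{UInt8, String}, Number}   # shorthand, not part of the original code
x = Dict(0x1 => 1.0, 0x2 => 1.0)

# Nothing is stripped, leaving the concrete type D, and convert(D, x) exists.
a = convert(Union{D, Nothing}, x)
typeof(a) == D                           # true

# Here stripping Nothing leaves Union{D, String}, and no convert method applies.
err = try
    convert(Union{D, String, Nothing}, x)
catch e
    e
end
err isa MethodError                      # true
```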


Thanks for pointing that out, that solved my confusion (regarding the initial problem).


Sorry to ask another question, please feel free to drop the discussion (@all)!
Question: Is the result summarized below (the last two sentences at the end) correct?


So it was still not really clear to me why all of your answers should mean that Dict{Any,Any} <: Dict holds true, while Dict <: Dict{Any,Any} does not. This confusion was largely based on “trying it out” instead of thinking it through, which looked like:

julia> Dict()
Dict{Any, Any}()

julia> Dict{Any, Any}()
Dict{Any, Any}()

julia> Dict{<:Any, <:Any}()
Dict{Any, Any}()

And since

Any is the union of all types. It has the defining property isa(x, Any) == true for any x . Any therefore describes the entire universe of possible values.

this led me in the wrong direction. I then tried wrapping my head around why Dict{Any, Any} should be a “more concrete” type than Dict{<:Any, <:Any} (since Any being the union of all possible types would indicate that every type is a subtype of Any, and therefore both Dicts would need to support the same structure).

But what I’ve missed is that Dict() without a type can be something different. While

julia> d = Dict();

julia> d[1] = 1;

julia> d
Dict{Any, Any} with 1 entry:
  1 => 1

still results in a Dict{Any, Any}, doing

julia> Dict(1 => 1)
Dict{Int64, Int64} with 1 entry:
  1 => 1

now results in a Dict where Dict{Int64, Int64} <: Dict{Any, Any} does not hold (due to stated reasons). And now it makes sense why not Dict <: Dict{Any, Any}, while Dict <: Dict{<:Any, <:Any}.
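
The summary can be confirmed by checking what each constructor call returns against both types. A brief sketch:

```julia
d_empty = Dict()          # no arguments: falls back to Dict{Any, Any}
d_ints  = Dict(1 => 1)    # parameters are inferred from the pair

typeof(d_empty) == Dict{Any, Any}   # true
typeof(d_ints) == Dict{Int, Int}    # true

# Both are instances of the UnionAll type Dict ...
d_empty isa Dict                    # true
d_ints isa Dict                     # true

# ... but only d_empty is an instance of the concrete type Dict{Any, Any}.
d_empty isa Dict{Any, Any}          # true
d_ints isa Dict{Any, Any}           # false
```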