Unpack types in Union

I would like to extract a Vector{DataType} from a Union. I found the solution shown below, but it looks weird to me, and non-idiomatic.
Is there a better way to unpack types in a Union?

My solution:

unpack_union(tt::Core.TypeofBottom) = DataType[]
unpack_union(tt::DataType) = [tt]
unpack_union(tt::Union) = 
    [getfield(tt, :a), unpack_union(getfield(tt, :b))...]

providing the following results:

julia> unpack_union(Union{})
DataType[]

julia> unpack_union(Union{Int})
1-element Vector{DataType}:
 Int64

julia> unpack_union(Union{Int, String})
2-element Vector{DataType}:
 Int64
 String

julia> unpack_union(Union{Int, String, Missing})
3-element Vector{DataType}:
 Missing
 Int64
 String

and why is it that you want to do this? just making sure it’s not an XY problem…

Sorry, I don’t know what an XY problem is…
Still, I think the question is worth an answer (even if my problem could be solved differently…).

there are a lot of code smells you can write; that doesn’t mean you should

Thank you, I didn’t know that!

Indeed, I don’t want to. That’s why I asked what is the correct way to do it…

that’s why I asked: what do you need this for?

I would like to keep the discussion focused on the topic, so forget about my proposed solution (which I really don’t like…) and think of it as simply:

Is there an idiomatic way to unpack types in a Union ?

Even “no” is an acceptable answer :wink:

the answer is you shouldn’t be doing that, so if it turns out you really need to, please show us the use case so people can help you.

OK, I can live with it. It’s not clear to me why such introspection should be discouraged or forbidden, given all the amazing Julia features (think about macros…)
But this is definitely not the point I’m interested in.

Thank you for your answer.

doesn’t look that horrible to me :slight_smile:
I would maybe hide everything behind another function:

# Accept any type: Union{Int} collapses to Int and Union{} is not a Union instance
function unpack_union(tt::Type)
    _unpack_union(tt::Core.TypeofBottom) = DataType[]
    _unpack_union(tt::DataType) = [tt]
    _unpack_union(tt::Union) = [getfield(tt, :a), _unpack_union(getfield(tt, :b))...]
    _unpack_union(tt)
end

Well, consider the following:

  • a Union is definitely not a structure, still you can use it in getfield;
  • propertynames() always returns two names (a and b), the second being a nested Union if more than two types are involved;
  • the property names a and b to access the types in a Union are not documented in the manual (AFAIK).

Anyway good to know that it doesn’t sound horrible!
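
For anyone who wants to see the three points above in the REPL, here is a small sketch. It also uses Base.uniontypes, an unexported (so not public API) helper that flattens the nested a/b structure; since it is internal, treat both the helper and the order of its result as something that may change between Julia versions.

```julia
U = Union{Int, String, Missing}

# `Union` is itself a type with two fields, which is why getfield works here.
@show fieldnames(Union)

# With three members, the second field holds a nested Union of the remaining two.
@show U.b isa Union

# Unexported Base helper: walks the a/b chain and collects the member types.
# Returns a Vector{Any}; do not rely on the element order.
types = Base.uniontypes(U)
@show types
```

Because the result is a Vector{Any}, it also copes with members that are not a DataType (for example a UnionAll such as Vector).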


I mean, you’re definitely depending upon internals — Unions can behave surprisingly and sometimes disappear entirely, depending on how they’re being generated. Defining dispatch on ::Union can be tricky (as you’ve found with TypeofBottom and such). You’re definitely well outside of what inference can possibly track, but that’s not necessarily a bad thing.

I’d try to reformulate your problem such that you don’t need to do this, if at all possible.
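
To make the "disappear entirely" part concrete, here is a short sketch of how Union expressions can collapse at construction time, which is why a method on ::Union alone never sees them:

```julia
# A single-member Union is just that member, so it is a DataType, not a Union.
@show Union{Int} === Int
@show Union{Int} isa Union

# A member that is a subtype of another member is absorbed.
@show Union{Int, Integer} === Integer

# The empty Union has its own singleton type.
@show Union{} isa Core.TypeofBottom
@show Union{} isa Union
```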


Sometimes a data-reading library returns arrays of Union{Missing, Float64} or Union{Missing, Float32}. Because some other libraries (packages) I use don’t handle missing, I sometimes need to convert the missing values to NaNs.

So, how do you determine the second element of the Union?

function readalldata()
  a = readdata()
  b = replace(a, missing=>eltype_of_data(NaN)) # how to determine the type?
  return b
end

Because in my case there are only two possibilities, I can branch like

eltorg = eltype(a)
elt = if eltorg == Union{Missing,Float64}
        Float64
      elseif eltorg == Union{Missing,Float32}
        Float32
      else
        error("unknown type: $(eltorg)")
      end
b = replace(a, missing => elt(NaN))

Inelegant, but manageable.

Also, I know that the right approach is to ask the package writers to support missing . . .

How about nonmissingtype?
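
A minimal sketch of how that would replace the branching, assuming the element type is Union{Missing, T} with T a floating-point type (the array below is just a stand-in for what readdata() returns):

```julia
# Stand-in for the data returned by the reading library (made-up values).
a = Union{Missing, Float32}[1.0f0, missing, 3.0f0]

# Strip Missing from the element type; here this gives Float32.
T = nonmissingtype(eltype(a))

# Replace each missing with a NaN of the matching float type.
b = replace(a, missing => T(NaN))

# Broadcasted alternative that does the same substitution.
c = coalesce.(a, T(NaN))
```

If T is not a float type (say Int), T(NaN) throws, so the explicit error branch from the original code is still worth keeping in some form.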

Friendly suggestion that it would probably be better to create a new thread, perhaps linking to this one, instead of posting in one that’s been inactive for over 3 years :slight_smile:


Thanks! But, it’s curious how the function is implemented. See below.

That depends on the subject of the new thread you are suggesting. If you suggest starting a thread about getting the other part of Union{Missing, Sometype}, then nonmissingtype is the solution, but this thread is more general: How to deconstruct a Union, which hasn’t gotten a clean and idiomatic answer.

I just continued this thread to provide an example where such functionality is useful, because the original poster was blamed for not providing a use case.

So, to continue, nonmissingtype is implemented like this

nonmissingtype(::Type{T}) where {T} = typesplit(T, Missing)

But I’m not able to find where typesplit() is defined. (I don’t know GitHub well enough.) I tested it a bit and found that it works on any Union. It acts like subtracting a type from the Union:

Base.typesplit(Union{S,T,U}, T) == Union{S,U}

Inside the function there must be an iteration to go over S, T, and U one by one. Does this iteration use the :a and :b trick discussed above?

julia> methods(Base.typesplit)
# 1 method for generic function "typesplit" from Base:
 [1] typesplit(a, b)
     @ promotion.jl:147
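
Such a subtraction can indeed be written by recursing over the a and b fields. Below is a sketch under that assumption; subtract_type is a hypothetical name, and this is not copied from Base. As far as I can tell, Base.typesplit in promotion.jl follows the same pattern, but check the source for your Julia version (for example with @less Base.typesplit(Int, Missing)).

```julia
# Hypothetical helper illustrating the idea; not Base's code verbatim.
function subtract_type(@nospecialize(a), @nospecialize(b))
    # A component fully covered by b is dropped.
    a <: b && return Union{}
    if a isa Union
        # Recurse into both halves and rebuild a Union from what survives.
        return Union{subtract_type(a.a, b), subtract_type(a.b, b)}
    end
    # A leaf type that b does not cover is kept as-is.
    return a
end

@show subtract_type(Union{Int, String, Missing}, Missing)
@show subtract_type(Union{Missing, Float32}, Missing)
```

There is no explicit loop: the recursion on a.b walks the nested chain, and Union{...} with a Union{} member simply drops that member.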