Small union types: present and future (looking for some guidelines)

A bit of context: I am in the process of writing a micro-optimized parser for some ASCII data (a lot of it). Parse functions can return a value, or error information (“parsing failed at character …, for reasons …”). So it is like a Nullable, but the “null” case has more information than a Boolean (an Int does fine in practice).

I first implemented it like a Nullable with extra info,

struct MaybeParsed{S,T}
    MaybeParsed{T}(info::S) where {T,S} = new{T,S}(info)
    MaybeParsed(info::S, value::T) where {T,S} = new{T,S}(info, value)

When combining these parsers (eg parse multiple fields in a line) I ran into the predictable nightmare of calculating the return type when there is no value.

Then I switched to the following: for errors, return a ParserError(::S) object, otherwise the value. S is a bits type (again, Int for most of the time). In 0.6, the speed is only nominally affected (about 5%). In master, it is faster.

So I am happy, but would like to design my code in a way that it would be performant in 0.7. I am of course OK with sacrificing a bit of speed in 0.6, and do refactoring/rebenchmarking if necessary from time to time — I recognize that 0.7 and especially 1.0 are in flux at the moment.

However, it would be great to receive some preliminary guidance about what the compiler can deal with (and still emit optimized code) in for 0.6 and 0.7 when it comes to Union{T1, T2, ..., Tn} return types. I did my homework, and read through a lot of github discussions, but I am still confused, and finding things out empirically. Particularly,

  1. what are the restrictions on n? I found that even n=3 works occasionally, n=2, reliably.

  2. what are the restrictions on the Ts? They would of course have to be concrete types, but do they have to be bits types? Any other restrictions?

  3. The above questions are about return types, where the caller resolves the uncertainty immediately (maybe that’s why it is fast for me?) My impression is that Union types as fields or eltypes are still not recommended. Is this correct?


Technically, the limit is typemax(Int8), but the compiler has trouble w/ “exotic” Union types; I’d say anything over 6-8, you’ll likely run into compiler issues. And what I’m talking about here is any kind of Union types, concrete, abstract, whatever.

There are no restrictions on Ts generally; unless you’re looking for the current memory optimizations available in 0.7. The memory optimizations ares:

  1. Field types of “isbits-Union” are stored inline, with extra “selector” byte to indicate which type is stored in the actual field slot. This works by reserving a field slot big enough to hold the largest Union element type; so a field declared w/ a type of Union{Void, Int8, Int64} is going to reserve 9 bytes in the struct: the first 8 in order to hold a potential Int64, and the last byte to store a Int8 value of 0, 1, or 2, depending on if the actual value of the field is Void, Int8, or Int64, respectively.
  2. This same type of optimization has been implemented for Arrays of isbits-Union eltypes. For example, in 0.6, a Vector{Union{Void, Float32}}(10) allocated 80 bytes for elements, because Union eltypes were stored as pointers to actual values. In 0.7, that same array now allocates only 40 bytes, half as many, because the elements are now stored inline like other isbits types.

Yes, currently there is an open issue about performance issues w/ inferred Union return types. It currently boils down to the fact that when a Union type is inferred, codegen then tries to speculatively branch on the specific Union types. The problem is that the branches are inserted correctly, but within the branches, nothing ever gets inlined. This ends up causing severe performance problems if this situation happens in a hot inner loop. But as you’ve discovered, there’s a current workaround where the user has explicit code to branch on the return type anyway! In that case, inference doesn’t have to generate the Union branches, it can use them and inlines happily.

Hope that helps explain things! Let me know if anything didn’t make sense.


I have a related question:

f1(x) = x > 3 ? (Nullable(1),) : (Nullable(2.0),)
f2(x) = x > 3 ? (1,) : (2.0,)
f3(x) = x > 3 ? Nullable(1) : Nullable(2.0)

the return types are, respectively, Tuple{Nullable}, Tuple{Real}, and Union{Nullable{Float64}, Nullable{Int64}} (on both 0.6 and 0.7). What are the rules to decide when to keep the union, and when to merge them into one abstract type? Should I try to keep the union (eg. by wrapping one of the two values)?

Incidentally, this might be a regression: if I write a dumb loop summing the numbers over x=10:20, I find that f2 doesn’t allocate, f3 allocates on 0.6 but not on 0.7 (:ok_hand:), but f1 allocates 40% more on 0.7 than on 0.6

function foo()
    su = 0
    for x in 10:20
        su += get(f1(x)[1])
@allocated foo()

I do have a more general design question which came up thinking about a PR for date validation and parsers in general.

When designing an API which

  1. parses and validates another representation of a type T (eg "123" to 123::Int),
  2. with the option of diagnosing invalid values without try ... catch,

the standard approach up to v0.6 has been something like tryparse(T, ...), which would return a Nullable{T}. When isnull, the caller would know that something went wrong, but nothing more specific.

Now with small union types, the following makes sense:

  1. define some type E <: Exception depending on T (possibly a narrower abstract type should be created for this, <:ParsingError{T}?), which contains detailed diagnostic information on what went wrong (eg “the 3rd character of what was supposed to be an Int is not a digit”),

  2. have tryparse(T, ...) and similar return a type of Union{T, E}.

This would be basically costless in v0.7 when the caller checks the result. parse(T, ...) could just do it and when the result isa E, throw it as is.