Sum types in Julia

Sum types are great invention that simplify some tasks tremendously - i. e. for handling invalid values, parsing dsl or any data with weird shape in general.

I think this article will give a better explanation.

Have Julia authors thought about adding this feature into the language.
Have they ever worked in a language that provides them?
What is your opinion?

http://julia.readthedocs.io/en/latest/manual/types/#type-unions

1 Like

Type union is not a sum type although they look similar.

Maybe you can tell us more about what features of sum types you are particularly interested in? Work is currently going on to optimize storage and dispatch of Union types. What would you need more?

3 Likes

Agreed. I read the description, but it still looks like programming mumbo-jumbo to me. But if you give a super slick example (preferably related to scientific computing :slight_smile:) then I’ll understand it and probably like it.

A type union in Julia is exactly a sum type, according to my understanding. (However, Julia doesn’t have the nice pattern-matching syntax for working with algebraic datatypes that you have in ML etc.)

1 Like

What about tagged unions?

Julia has another kind of “pattern matching” :slight_smile:

struct ClickEvent
    x::Int
    y::Int
end

struct PaintEvent
    c::Color
end

const Event = Union{ClickEvent, PaintEvent}

handle_event(ev::ClickEvent) = ...
handle_event(ev::PaintEvent) = ...
handle_event(ev) = error("No such event")

The only real difference I can see is that Julia unions aren’t closed and not checked during compilation, i.e. there’s no way to prevent a user from creating more event types.

4 Likes

A sum type doesn’t seem to be a type union - and it is a feature that would be very useful in Julia. It has been discussed before for Julia, i.e. called tagged unions, discriminated unions, or tagged variants.
If you have a type union in Julia, in means you have to have a pointer to a boxed object that has it’s type information,
which is not the same as being able to have multiple types sharing the same memory, distinguished by a tag.

That seems like a question of implementation, not semantics. (Do sum types in Haskell or ML make any guarantees about memory representation?)

(In fact, there are ongoing efforts to improve the efficiency of unions of small numbers of types in Julia, driven in part by things like nullable types. It may well be that in a future version of Julia something like Union{Int,Float64,Void} will be stored inline (e.g. in an array or a mutable) with a type tag, rather than as a pointer to a heap object.)

For example, see: https://github.com/JuliaLang/julia/pull/20593

Not really - when talking about tagged unions, the definition is basically that the memory is shared.

“The primary advantage of a tagged union over a simple record containing a field for each type is that it saves storage by overlapping storage for all the types. Some implementations reserve enough storage for the largest type, while others dynamically adjust the size of a tagged union value as needed. When the value is immutable, it is simple to allocate just as much storage as is needed.”

There is a semantic difference between sum types and union types, but it is subtle. The difference is that Sum{Foo, Bar} contains information about whether it is a left object of type Foo, or a right object of type Bar. This difference is important when the intersection of Foo and Bar is non-empty.

For instance, in Haskell:

GHCi, version 7.10.3: http://www.haskell.org/ghc/  :? for help
Prelude> let foo 0 = Left (); foo n = Right ()
Prelude> let bar (Left _) = "left"; bar (Right _) = "right"
Prelude> foo 0
Left ()
Prelude> foo 1
Right ()
Prelude> bar (foo 0)
"left"
Prelude> bar (foo 1)
"right"

whereas a union type loses the left/right information.

From Wikipedia,

In set theory the equivalent of a sum type is a disjoint union, a set which elements are pairs consisting of a tag (equivalent to a constructor) and an object of a type corresponding to the tag (equivalent to the constructor arguments).

However, sum types are easily implemented in a package. Here are some sample implementations.

struct Left{T}
    val::T
end

struct Right{T}
    val::T
end

const Sum{T, U} = Union{Left{T}, Right{U}}

or with a backing Union:

struct Sum{T, U}
    isleft::Bool
    val::Union{T, U}
end

or with Nullable:

struct Sum{T, U}
    left::Nullable{T}
    right::Nullable{U}
end

or with memory being shared, for primitive types of the same size:

struct IsBitsSum{T, U}
    isleft::Bool
    bits::T
end

unsafe_left(x::IsBitsSum) = x.bits
unsafe_right(x::IsBitsSum{<:Any, U}) where U = reinterpret(U, x.bits)
left(x::IsBitsSum) = (@argcheck x.isleft; unsafe_left(x))
right(x::IsBitsSum) = (@argcheck !x.isleft; unsafe_right(x))
get(x::IsBitsSum) = x.isleft ? unsafe_left(x) : unsafe_right(x)
5 Likes

Tagged pointers I think still would need low-level system support, and this is something very commonly used in C programming (Julia itself uses it internally).
There would need to be some way of telling julia the mapping between the tag (which might be encoded in the bottom bits of the pointer), and what type the rest of the value is (it might be an integer, or a 61 bit floating point value, or different types of pointers), and have julia be able to know how to follow pointers during marking for GC.

isn’t this the same situation as Nullable? with unions, this is ambiguous, and in fact we have functions for A, B and C:

T1 = Union{A, B}
T2 = Union{A, C}

something(t1::T1) = ... # t1 will be either A or B
something(t2::T2) = ... # t2 will be either A or C

however, with an imaginary sum type we would have

T1 = Sum{A, B}
T2 = Sum{A, C}

something(t1::T1) = ... # t1 will be Sum{A, B}, and can be queried which value is set
something(t2::T2) = ... # t2 will be Sum{A, C}

so there is semantic difference. note that i’m not arguing that Julia needs a sum type. just trying to see what it gives.

I played with explicitly tagged types here: https://github.com/andyferris/Switches.jl

However, I stopped developing it once I realized @jameson was keen to make (implicitly tagged, by the compiler) Union types work faster…