What is the Julia equivalent of Rust Option<T>?

world-peace · November 1, 2024, 1:55pm

I read a few times about different ways in which Julia can represent optional wrapped types for return values.

I am fairly confident I have seen this suggested before

return type: Union{T, Nothing}
# usage:
return nothing
return t # value

Alternatively, this seems like it would also make sense

return type: Union{T, Missing}
return missing
return t # value

Is there a reason to prefer one of these over the other, or better yet, is there a canonical way of implementing optional return values in Julia, similar to Rust’s std::option wrapper type?

// Rust
std::option::Option<T>
// examples
match optional_t {
    None => { ... }
    Some(t) => { println!("{}", t) }
}

if let Some(t) = optional_t ...

if !optional_t.is_none() ...

sgaure · November 1, 2024, 2:03pm

Julia does not have algebraic types like Rust’s enum.

The Missing and Nothing types are essentially the same. I.e. they are defined as

struct Nothing end
const nothing = Nothing()
struct Missing end
const missing = Missing()

However, Nothing is built in because it’s needed for functions that do not return anything.

The intended use of Missing is for use in datasets where values sometimes are missing. Functions handling such data typically have special handling of Missing. The intended use of Nothing is to return, eh, nothing.

So if you return Union{Missing, T} or Union{Nothing, T} doesn’t matter unless you plan to use some of the automatic Missing handling.

julia> maximum([missing, 12])
missing

julia> maximum([nothing, 12])
ERROR: MethodError: no method matching isless(::Int64, ::Nothing)
The function `isless` exists, but no method is defined for this combination of argument types.
...

There is also a type Some, used to wrap any value, so you can distinguish returning the value nothing (wrapped, as Some(nothing)) from returning no result at all (a bare nothing).
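
As a minimal sketch of that distinction (the lookup helper and the dictionary below are made up for illustration and are not part of Base), a lookup over a Dict whose values may themselves be nothing can wrap hits in Some, so that a stored nothing and an absent key stay distinguishable:

```julia
# Hypothetical helper: wrap a found value in `Some`, return a bare `nothing` for an absent key.
lookup(d::AbstractDict, key) = haskey(d, key) ? Some(d[key]) : nothing

d = Dict(:a => 1, :b => nothing)

found_value   = lookup(d, :a)   # Some(1)
found_nothing = lookup(d, :b)   # Some(nothing): the key exists and stores `nothing`
not_found     = lookup(d, :c)   # nothing: the key does not exist

# `something` unwraps a `Some`; calling it on a bare `nothing` throws an ArgumentError.
something(found_value)      # 1
something(found_nothing)    # nothing
```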

Mason · November 1, 2024, 2:06pm

Depends on the use-case, but I would typically use nothing rather than missing (which has some incredibly annoying semantics).

Neither of these is the same as Rust’s Option<T> though, since T <: Union{T, Nothing} whereas T is not a subtype of Option<T> in Rust.
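
The subtyping point can be checked directly at the REPL. This is only a quick sketch, using Int as a stand-in for T and the Base wrapper Some for comparison:

```julia
# A plain value is already a member of the Union, so nothing forces callers to unwrap it.
Int <: Union{Int, Nothing}              # true
1 isa Union{Int, Nothing}               # true

# With the Base wrapper `Some`, a bare Int is no longer a member; it must be wrapped first.
Int <: Union{Some{Int}, Nothing}        # false
Some(1) isa Union{Some{Int}, Nothing}   # true
```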

Likely the most Rust-like thing currently would be in the package SumTypes.jl.

using SumTypes
@sum_type Option{T} begin
    None
    Some{T}(::T)
end

julia> let x::Option{Int} = None
           @cases x begin
               None => "got a none!"
               Some(t) => "got $t"
           end
       end
"got a none!"

julia> let x::Option{Int} = Some(1)
           @cases x begin
               None => "got a none!"
               Some(t) => "got $t"
           end
       end
"got 1"

world-peace · November 1, 2024, 2:19pm

To clarify the use case for this - if I have a function, which calculates some value, but that return value is optional, what should I use?

I investigated to see if there were any packages available which implemented this as its own type, rather than having to go around violating D.R.Y. by writing

Union{T, Nothing}

in multiple places.

I found these two, but they seem to be deprecated / abandoned. (No recent commits, suggesting they are not being maintained.)

To give some further details about my use case: I can’t provide the exact details, but you could imagine this as an analogy:

There is a function which calculates the standard deviation of some data
This function takes a timeseries (vector of values) as an input
It also takes a “number of days” argument, which can be 0, 1 or larger than 1
If “number of days” is 0 or 1, clearly the standard deviation cannot be calculated
It is not an error if the number of days is set to 0 or 1. This is an acceptable input in some contexts.
I do not want to return NaN, because this could be silently missed. If I return some “optional<T>” thing, then this strongly encourages users of the API to check for none/nothing/missing before using the return value.
Returning a floating point number, which could be NaN does not place the same emphasis on clients using the API.

This is really a “I want to productionize some software to make it more robust” kind of a problem rather than being anything to do with calculating standard deviations. I just give that as an example to have something to visualize.
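
For concreteness, the analogy could be sketched roughly as follows. The names trailing_std and describe are hypothetical, and treating "fewer data points than days" as a non-error is an extra assumption, not something stated above. Unlike NaN, a nothing result cannot silently flow into later arithmetic: an expression such as trailing_std(x, 1) * 2 throws a MethodError.

```julia
using Statistics  # standard library, provides `std`

# Hypothetical function mirroring the analogy: sample standard deviation of the
# last `ndays` values, or `nothing` when it cannot be computed.
function trailing_std(series::AbstractVector{<:Real}, ndays::Integer)
    # Assumption: too few days, or fewer data points than days, is not an error.
    (ndays < 2 || length(series) < ndays) && return nothing
    return std(@view series[end-ndays+1:end])
end

# Callers have to branch on `nothing` before using the result as a number.
function describe(series, ndays)
    s = trailing_std(series, ndays)
    return isnothing(s) ? "not enough data" : "std = $(round(s; digits=3))"
end

describe([1.0, 2.0, 3.0, 4.0], 1)   # "not enough data"
describe([1.0, 2.0, 3.0, 4.0], 2)   # "std = 0.707"
```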

world-peace · November 1, 2024, 2:24pm

BTW - this is also not a “hey look at this cool feature Rust has why doesn’t Julia have this” kind of a post. (Just in case anyone reads this in future thinking that it might be.)

C++ also has a similar thing.

I raise this because:

Rust does not have exceptions
This makes the std::option and std::result types essential to the language, because these types are how Rust handles errors and nullable values
On the other hand, C++ does have exceptions, but it also has a std::optional<T>
std::optional - cppreference.com
It also has an equivalent for error types, std::expected<T, E>:
std::expected - cppreference.com
C++ provides these things for performance reasons. It is typically faster to return values than emit and catch exceptions
The same logic does not apply in Python because there is no overhead (at least not notably so) for handling exceptions

^ Just a bit of extra context for those familiar with a range of languages

To summarize, maybe Julia does not have an equivalent of a nullable type or a result (for errors) type.

A nullable type is for handling optional/missing values in the return type rather than using exceptions
A result type is for handling errors in the return value instead of using exceptions

The advantage of these two things is not just performance, but separation of semantics. Using exceptions for missing values as well as errors conflates two ideas into the same machinery.

If Julia doesn’t have either - or if there isn’t good support (via packages) for either - then this is also an acceptable answer. It would mean we should just use exceptions instead.

Mason · November 1, 2024, 2:30pm

If that’s your use case, and you believe that this is a real problem, I would say you should definitely not be using Union{T, Nothing}, because a user can very easily write code that just assumes your function always returns a T and then get bitten when they eventually pass an input that returns nothing. The classic example of this is findfirst: people very often just stick its output into an indexing expression, which is brittle (or at the very least leads to sub-par error messages).

One option, if you don’t want to bring in a package and just want Base dependencies, would be to use Option{T} = Union{Some{T}, Nothing}.

Some is a wrapper around a result that needs to be consciously unwrapped before it can be used. E.g., one might write

fussy_findfirst(args...) = let res = findfirst(args...)
    if isnothing(res)
        res
    else
        Some(res)
    end
end

and then users would have to write

let res = fussy_findfirst(f, arr)
    if isnothing(res)
        some_informative_error()
    else
        ind = something(res)
        g(arr[ind])
    end
end

Union{Some{T}, Nothing} does not have all the advantages of Rust’s enum types (or SumTypes.jl) but it probably has most of the important ones to get the job done in the case you’re describing here.

sgaure · November 1, 2024, 2:32pm

By returning nothing you force the user to handle it or get an exception. If you return missing, some functions will handle it, maybe by returning missing, without throwing exceptions. So, missing is a sort of software NaN (or NaT, “not a thing”, an actual concept in Itanium CPUs).
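
A small sketch of that difference, using only Base operations (the variable name result is just for illustration):

```julia
# `missing` propagates silently through many operations, much like NaN.
missing + 1              # missing
sum([1.0, missing])      # missing
NaN + 1                  # NaN

# `nothing` does not propagate: arithmetic on it throws a MethodError.
result = try
    nothing + 1
catch err
    err
end
result isa MethodError   # true
```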

world-peace · November 1, 2024, 2:37pm

I’m just wondering if I am being slightly stupid here and asking for a behavior Julia can’t provide, on the basis that it is a dynamic language rather than a statically compiled one (and in the case of Rust, one with a very strong type system).

I suppose it doesn’t matter what the return type is, because I can’t force clients to handle all variants of the returned type(s), because the type system does not enforce this. (It’s dynamic.)

Possibly you’re right and returning a nothing directly is the most sensible approach.

Now I’m wondering to myself what are the consequences for performance and type stability if I do this?

Benny · November 1, 2024, 2:45pm

What matters to me is that Unions aren’t instantiable, but sum types are.

world-peace:

I investigated to see if there were any packages available which implemented this as its own type, rather than having to go around violating D.R.Y. by writing
Union{T, Nothing}
in multiple places.

This is not its own type, unless you really named a type T. Chances are someone loosely used it to describe a return value’s inferred types, or they’re referring to T as a type parameter for a field of a parametric type. In those cases, you want to write out Union{T, Nothing} for clarity. Making an alias MaybeNothing{T} = Union{T, Nothing} would needlessly obscure that it’s a Union, and it’s more worth making the name for a proper sum type.

Many Base functions are designed to possibly return nothing, so 2 types are well supported, and there are optimizations for storing isbits Union-annotated fields or Array elements (I think up to 256) and for branching over small inferred Union types (3, maybe 4). But you have to keep on top of the code’s type inference to keep Unions small: you can imagine that foo(::Maybe2Types, ::Maybe3Types) may compute one of 6 types and lose those optimizations and good type inference. Bear in mind that while Union-based code may have less wrapping and unwrapping, you still have to do checks like isnothing and branch to specific behaviors. While Unions and exceptions are common and SumTypes is more niche, use sum types if you need to handle sum types; the compiler will be happier.

Not throwing is still cheaper in any language. Exception handling is acceptable control flow in CPython mostly because other overheads outweigh the choice of alternative control flow constructs.

sgaure · November 1, 2024, 2:47pm

That is all fine. All iterators do this. I.e., if you write a loop like

for i in 1:10
    ...
end

It’s lowered to

next = iterate(1:10)
while next !== nothing
    (i, state) = next
    # body
    next = iterate(1:10, state)
end

So, iterate() returns nothing when the iterator is finished. As long as it’s a Union of fewer than 4 types (I think, see Union-splitting: what it is, and why you should care), the compiler handles it with high performance.

And, remember, it’s not possible to return something of type Union{T, Nothing}. That’s not a concrete type, and no objects have this type. However, the compiler uses such unions to describe that a function returns either T or Nothing.
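
This can be seen with a small sketch. The function positive_or_nothing is hypothetical; Base.return_types is an unexported reflection helper, and the exact way it prints the inferred type may vary between Julia versions.

```julia
# Hypothetical function that returns either an Int or nothing.
positive_or_nothing(x::Int) = x > 0 ? x : nothing

# Every returned object has a concrete type; none of them "is" a Union.
typeof(positive_or_nothing(1))        # Int (Int64 on a 64-bit machine)
typeof(positive_or_nothing(-1))       # Nothing
isconcretetype(Union{Int, Nothing})   # false

# The Union only shows up in what type inference reports for the call.
inferred = Base.return_types(positive_or_nothing, (Int,))
```

Here inferred is a one-element vector holding the Union of Int and Nothing, which is the compiler's description of the call rather than the type of any object.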

Benny · November 1, 2024, 3:41pm

I found this old blog post that does a better job of differentiating Julia’s Union and Rust’s sum types. The sections on the practical pros and cons for either approach are of particular interest. Link: Union vs sum types

I would guess that the only reason sum types aren’t utilized as much in Julia is that type optimization tends to go to extremes. Ideally you have perfect type inference in a generic method; you might get nothing and missing in some contexts, but those don’t escalate to more types (operations on missing propagate to missing, ones on nothing error) and are quickly handled. People don’t tend to maintain custom small Unions through a method, and why would they when the compiler only tracks a few? When there are more, maybe even infinite, types the compiler cannot generally track, optimization then becomes about isolating the dynamic dispatches to small caller methods, and the JIT-compiled callees can be perfectly inferred again. The middle ground of telling the compiler the finite number of types you handle with more care is more necessary for an AOT-compiled language. AOT-compiled juliac is being developed for its lower overhead, and I predict that sum types will become far more appreciated when the JIT compiler and its flexibility aren’t around.

mikmoore · November 1, 2024, 3:54pm

As others have noted, the compiler does well with small unions already. Further, most functions will break when given nothing and #37866 (now closed as completed) was an issue to track whether that sort of knowledge can be exploited by the compiler to eliminate even these small unions.

aplavin · November 1, 2024, 4:03pm

Regarding performance, non type-stable stuff like missing and nothing can easily be orders of magnitude slower than the corresponding type-stable code. Whenever the type has a natural sentinel value, like NaN for floats, it’s worth using that.

See a simple example of missing being more than two orders of magnitude slower than NaN in current Julia at Is there any reason to use NaN instead of missing? - #4 by aplavin and order of magnitude type-unstable performance regression · Issue #50130 · JuliaLang/julia · GitHub.

world-peace · November 1, 2024, 4:24pm

Experience suggests to me this is very much context dependent.

If you have a hot loop then I agree with you. It doesn’t seem very sensible to slow your code down by a factor of 2 (if it really is that much - it may depend on the exact logic being executed).
Otherwise, if performance isn’t of critical importance (80:20 rule and all that - 80% of runtime is in about 20% of your code) then experience tells me having an API which is harder to use wrong is better

I appreciate that if you’re working on some scientific simulation, you probably fall into the former category. If you are building (production) systems (this is what I spend most of my time doing), then ensuring other teams use your code correctly is usually the priority.

aplavin · November 1, 2024, 5:22pm

See the example in my previous post, it’s not a factor of 2 – it can be 2.5 orders of magnitude (> 200x slower) even in very simple one-line examples.

world-peace · November 1, 2024, 5:55pm

Sorry I misread your comment - but the above still applies.

Benny · November 1, 2024, 5:59pm

It’s worth pointing out that example works precisely because the small Union isn’t maintained by arrays’ element type inference:

julia> x_missing = [([1., 2., 3.],), ([1., 2., 3., missing],)]
2-element Vector{Tuple{Vector}}:
 ([1.0, 2.0, 3.0],)
 (Union{Missing, Float64}[1.0, 2.0, 3.0, missing],)

and even if you manually narrow the element type to a union of the 1-tuple of those 2 vectors’ types, the output array’s element type can’t be inferred from the 2 tuple types:

julia> map(x -> sum.(x), x_missing)
2-element Vector{Tuple{Any}}:
 (6.0,)
 (missing,)

julia> [(1.0,), (missing,)]
2-element Vector{Tuple{Any}}:
 (1.0,)
 (missing,)

Union-splitting is not at all in play here.

serhii · November 3, 2024, 11:20am

Have you considered using the ResultTypes.jl package?
