The | character is used in a few different languages as a concise way to specify type unions, like in TypeScript, Scala, PHP 8+, Python 3.10+, and (sort of) Haskell. I was curious to hear what people’s initial thoughts would be on having this as a concise way to create Unions. This would be completely backwards-compatible of course; Union by itself would still be Union.
Currently | is used for OR, which has close connections to type unions in formal logic (h/t @mkitti in this). Because of this connection, | is used for both OR and union in TypeScript, Scala, PHP, and Python.
Now, would it be possible to add this? From what I can see, there are no potentially ambiguous methods in Base:
Current methods:
[1] |(a::FileWatching.FileEvent, b::FileWatching.FileEvent)
@ FileWatching ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/FileWatching/src/FileWatching.jl:48
[2] |(b::Bool, a::Missing)
@ missing.jl:175
[3] |(x::Bool, y::Bool)
@ bool.jl:39
[4] |(a::Missing, b::Bool)
@ missing.jl:174
[5] |(::Missing, ::Missing)
@ missing.jl:173
[6] |(::Missing)
@ missing.jl:101
[7] |(::Missing, ::Integer)
@ missing.jl:176
[8] |(a::BigInt, b::BigInt, c::BigInt, d::BigInt, e::BigInt)
@ Base.GMP gmp.jl:543
[9] |(a::BigInt, b::BigInt, c::BigInt, d::BigInt)
@ Base.GMP gmp.jl:542
[10] |(a::BigInt, b::BigInt, c::BigInt)
@ Base.GMP gmp.jl:541
[11] |(x::BigInt, y::BigInt)
@ Base.GMP gmp.jl:501
[12] |(a::FileWatching.FDEvent, b::FileWatching.FDEvent)
@ FileWatching ~/.julia/juliaup/julia-1.10.0+0.aarch64.apple.darwin14/share/julia/stdlib/v1.10/FileWatching/src/FileWatching.jl:79
[13] |(::Integer, ::Missing)
@ missing.jl:177
[14] |(x::T, y::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8}
@ int.jl:372
[15] |(x::T, y::T) where T<:Integer
@ promotion.jl:518
[16] |(a::Integer, b::Integer)
@ int.jl:1064
[17] |(x::Integer)
@ operators.jl:527
[18] |(a, b, c, xs...)
@ operators.jl:587
So it should be backwards compatible to implement for union types like this:
Base.:|(T1::Union{Type,TypeVar}, T2::Union{Type,TypeVar}) = Union{T1,T2}
This is literally the entirety of the implementation. This lets you do things like:
julia> Float32 | Float64
Union{Float32, Float64}
julia> Int8 | Int16 | Int32
Union{Int16, Int32, Int8}
julia> Vector{<:AbstractFloat} | Matrix{<:AbstractFloat}
Union{Matrix{<:AbstractFloat}, Vector{<:AbstractFloat}} # (alias for Union{Array{<:AbstractFloat, 1}, Array{<:AbstractFloat, 2}})
as well as in method definitions:
julia> function add(a::(Float32|Float64), b::(Float32|Float64))
    a + b
end
add (generic function with 1 method)
julia> add(1.0, 1f0)
2.0
Personally, I find this “reads” better than
julia> function add(a::Union{Float32,Float64}, b::Union{Float32,Float64})
    a + b
end
because it’s a bit closer to how I would describe this function in language: “add takes float-32 or float-64”, rather than “add takes the union of float-32 and float-64”.
This is especially true for nested types, like
julia> f(a::Tuple{Float32|Float64, Float32|Float64}) = (a[1]^2, a[2]^2)
compared to
julia> f(a::Tuple{Union{Float32,Float64}, Union{Float32,Float64}}) = (a[1]^2, a[2]^2)
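A quick way to convince yourself that the two spellings are interchangeable is to compare the resulting types directly. This is only a sketch, assuming the one-line definition from earlier in this post has been evaluated (it extends a Base function on Base types, so it is fine for a REPL experiment but not something to ship in a package):

```julia
# Assumption: the one-line definition proposed earlier in this post.
Base.:|(T1::Union{Type,TypeVar}, T2::Union{Type,TypeVar}) = Union{T1,T2}

# Both spellings produce the same type.
@assert (Float32 | Float64) == Union{Float32,Float64}
@assert Tuple{Float32|Float64, Float32|Float64} ==
        Tuple{Union{Float32,Float64}, Union{Float32,Float64}}

# Chained use nests the calls, and Union flattens the nested result.
@assert (Int8 | Int16 | Int32) == Union{Int8,Int16,Int32}

# Dispatch on the resulting signature works as usual.
f(a::Tuple{Float32|Float64, Float32|Float64}) = (a[1]^2, a[2]^2)
@assert f((2.0f0, 3.0)) == (4.0f0, 9.0)
```

Since the operator just returns an ordinary Union, nothing downstream (dispatch, subtyping, printing) needs to know which spelling was used.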
Union is a frequently occurring symbol in Julia code; actually much more frequent than many existing infix operators (see this). A common use-case in data science is tables of numerical data with missing values, like:
Schema{Union{Float32,Missing},Union{String,Missing},Union{Int64,Missing}}
This gets simplified to:
Schema{(Float32|Missing),(String|Missing),(Int64|Missing)}
or in vectors like Vector{Real|Complex|Missing}. The parentheses here are a stylistic choice and aren’t technically needed inside curly braces.
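On the parentheses, here is a small check of how the parser groups these expressions. It uses only Meta.parse, so no new definitions are needed, and Schema does not have to exist. As far as I can tell, the parentheses make no difference inside curly braces, but they do matter directly after ::, because :: binds more tightly than |:

```julia
# Inside curly braces each comma-separated parameter is its own expression,
# so the parentheses do not change the parsed result.
@assert Meta.parse("Schema{(Float32|Missing),(String|Missing)}") ==
        Meta.parse("Schema{Float32|Missing,String|Missing}")

# Directly after `::` the grouping differs.
with_parens = Meta.parse("a::(Float32|Float64)")
without_parens = Meta.parse("a::Float32|Float64")

# a :: (Float32 | Float64)
@assert with_parens.head == Symbol("::")

# (a :: Float32) | Float64
@assert without_parens.head == :call
@assert without_parens.args[1] == :|
```

So in argument annotations like a::(Float32|Float64) the parentheses should stay.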
This can also be used for setting defaults:
f(; cleanup::Union{Bool,Nothing}=nothing) = ...
which becomes
f(; cleanup::(Bool|Nothing)=nothing) = ...
which notably moves the Bool closer to the ::. Quickly scanning the code, it’s easier to see that cleanup is a boolean.
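To make the keyword example concrete, here is a minimal runnable sketch. It assumes the one-line definition from this post has been evaluated; describe_cleanup is a made-up function used only for illustration:

```julia
# Assumption: the one-line definition proposed in this post
# (fine for a REPL experiment, not for a package).
Base.:|(T1::Union{Type,TypeVar}, T2::Union{Type,TypeVar}) = Union{T1,T2}

# `describe_cleanup` is a hypothetical function, not part of any library.
describe_cleanup(; cleanup::(Bool|Nothing)=nothing) =
    cleanup === nothing ? "default" : (cleanup ? "on" : "off")

@assert describe_cleanup() == "default"
@assert describe_cleanup(cleanup=true) == "on"
@assert describe_cleanup(cleanup=false) == "off"

# A value of any other type is rejected by the keyword's type annotation.
threw = try
    describe_cleanup(cleanup=1)
    false
catch
    true
end
@assert threw
```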
This syntax also has connections to existing Regex syntax, like:
julia> occursin(r"(A|B)", "A")
true
which checks for matches to A or B.
Just curious to hear what people think and initial reactions.
I couldn’t find any existing Discourse threads, but it seems @Mason suggested this syntax in a PR comment in 2020, where it got buried in the main discussion (h/t @mbauman). As to why | was not suggested earlier in Julia’s history, it is likely because it has only recently become dominant in other languages for representing unions.
- Edit: removed comment about operator precedence
- Edit 2: updated main implementation
- Edit 3: updated key text
- Edit 4: pointed out initial suggestion on GH
- Edit 5: added more examples
- Edit 6: Changed Base.:|(t::Type, types::Type...) into Base.:|(::Type{A}, ::Type{B}) to reduce surface
- Edit 7: added regexp example
- Edit 8: Fixed definition to also include TypeVar which is valid input to Union