Proposed alias for union types

“Julia veteran” is a fuzzy term, but note that very few of the people who would actually decide this commented here, and I don’t see them being enthusiastic about it.

FWIW, changing (or even extending) the surface syntax of a mature language is always tricky, because

  1. All the low-hanging fruit was picked a long time ago, and the status quo is either good enough for practical purposes or too costly to change,

  2. It is a topic that comes up so often that many core devs are, frankly, sick and tired of it. It is very easy to propose new syntax, so it happens all the time, but very few of these suggestions are implemented, especially in the last few years. This is a good thing: a constantly changing core language is the last thing that most users want.

  3. Even harmless-looking extensions are resisted because whatever new feature is introduced, it will have to be kept around until 2.0, which is definitely not happening in the near future. In the meantime, it has increased the complexity of the language, and has taken up useful syntax that cannot be used for anything else now. If you want to introduce new syntax, it is not sufficient to have a minor benefit, it has to be a major one; the bar is really high. But see (1).

So whether you make a PR depends on how much time you want to invest in this. Triage will evaluate it at some point and you will get an answer; but be prepared for the possibility that it is not accepted or languishes for a long time after a long and meandering discussion.

FWIW, I would do what @mkitti recommends above, i.e. work out the details in a package, experiment with it for a while (>10k LOC), and then make a PR to Julia proper only if you really need a change in the core language to make this work. Otherwise, a package is best.

8 Likes

A post was split to a new topic: Non-deterministic normalization of Tuples of Unions

Totally agree here, and I wish there was an actual requirement for all proposed function/type additions to pass through this process! (aside from compiler/internals)
Unfortunately, that’s far from being the case, but we can at least try (:

In this specific situation, create a small package like UnionSyntax.jl with:

  • Base.:(|)(a::Type, b::Type) = Union{a,b} definition
  • Corresponding show(Type) change
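For concreteness, here is a minimal sketch of what the first bullet could look like in use. UnionSyntax.jl is only a suggested name, the `MaybeInt` alias and the `describe` function are made up for illustration, and the show change is left out.

```julia
# Type piracy: this adds a method to Base's `|` for types this code does not own.
Base.:(|)(a::Type, b::Type) = Union{a, b}

# Hypothetical alias, just to show the operator at work.
MaybeInt = Int | Missing

# Parentheses are required in annotations because `::` binds tighter than `|`.
describe(x::(Int | Float64)) = "number"
describe(x::(String | Missing)) = "text or missing"
```

With this loaded, `MaybeInt` is the same type as `Union{Int, Missing}`, and `describe(1)` dispatches to the first method.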

Yes, a bit of type piracy, but shouldn’t cause any issues.
Then start using this package in your code, in your other packages, add to startup.jl. Maybe others will also add it sometimes, maybe not – but the main point is to actually use it for a significant time in real code.

Then, after some time passes and you are happy with how it turns out, try making a PR to Julia. A function like this does make sense to live in Base and not an external package to get any real usage, but at that point you’ll have actual informed experience with it already. Even if the PR isn’t accepted, you still have UnionSyntax.jl to use!

2 Likes

It seems to me that this new syntactic sugar is perfect for testing whether the use of PEP-like mechanisms is useful in the development of Julia. In addition to the proposed syntactic sugar, there are two other good alternatives, and it is not clear (at least to me) whether any (or all) of the solutions are non-breaking changes. Given that the proposal wants to change a good status quo, it might be worth taking some time and developing a PEP-like mechanism for other proposals.

1 Like

I hate to repeat myself from upthread, but this is an unequivocally terrible idea for packages.

Pulling in a dependency is even worse than the type piracy, in terms of “weirdness tax” for a tiny syntax improvement: using such a package is almost never worth it.

Don’t go NPM leftpad. There is a minimal package size that is viable; smaller stuff needs to go either into Julia Base or into some larger semi-standard SyntaxSugars.jl from which Julia Base can then cherry-pick successful experiments.

I am mildly pro this proposed syntax (it has lots of precedent, and it is clearly slick and nice, and it is pure ascii).

But it really needs to go either into Base, or not at all. (FWIW, I like the part of the Java JEP process where experimental APIs ship in each JVM release, gated behind command-line switches.)

1 Like

You mean, to use it in major packages? Sure, better not to do that (:

But to use in personal code, and to let others easily try the suggested syntax/functions? This approach would be great! Definitely much better than pushing for a change directly into Base, without any actual usage beforehand. Even a little usage experience is better than none.

Would be interesting to see, but I guess such a package doesn’t exist for now? Or the link got broken.

How to advocate for this really depends upon how much you care and how much time and energy you want to put in. The best way to truly get a complete evaluation would indeed be with a comprehensive pull request that implements the definition you want, complete with tests, news, etc., as well as a concise enumeration of the tradeoffs everyone brought to this thread so it’s easy to weigh the pros and cons. But that takes not-insignificant work… and it’s not guaranteed to ever fully resolve. For example, take the attempts at underscore anonymous functions.

If you want a quicker-but-less-definitive go or no-go: if I were you, I’d enumerate the pros and cons as tersely as I can, drop by #triage on Slack or attend a triage meeting, and ask for a quick discussion. The biggest question in my mind is just how bad Jeff et al. would consider the | punning to be. Just like what happened in Python-land, though, the first reaction from a core person or two might not be a good predictor of what would happen with a more complete PR/proposal.

Language changes aren’t ever driven by pure democratic simple majorities — not even majorities of the committers, not even majorities of the crew on any given triage call. In my experience, the committers listen pretty closely to any and all objections (both from other committers and the community at large), and the status quo prevails in cases where there are not-insignificant objections. So I don’t have a crystal ball, but I’d temper your expectations (and investments) by re-reading your own post there in that light.

12 Likes

Thanks. I probably won’t have time to go through all of this then. But it’s great to hear there is interest. Maybe one day we can have this if someone wants to push it through. Unlike the underscore proposal, getting | implemented is just a one-liner with no parser changes, so at least the implementation itself is much easier.

1 Like

While the implementation is indeed simpler, the history of #24990 and its antecedents is informative because a lot of the ramifications only became apparent with a full implementation. For this proposal, the only surprise I can imagine is precedence, but since ∪, ∨ and | have the same precedence, a package can experiment with this freely. ([Obviously all bind tighter than ::, which commonly precedes Union; edit: this is incorrect] but is there something we haven’t thought of?)

I am not sure about this. Consider e.g. UnPack.jl, which provides the syntax @unpack a, b, c = container. While its implementation is more complex and it has extras, its core functionality was this one thing (also implemented by SimpleUnPack.jl). Hundreds of packages were using it before the (; a, b, c) = container syntax was introduced into the core language.
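For readers who have not seen it, this is roughly what the core-language form looks like. It is a small sketch assuming Julia 1.7 or later; `container` and the `Params` struct are made-up names.

```julia
# Property destructuring: pulls fields out by name, for NamedTuples and structs alike.
container = (a = 1, b = 2, c = 3)
(; a, b, c) = container

# Hypothetical struct, to show the same syntax on a user-defined type.
struct Params
    alpha::Float64
    beta::Int
end

(; alpha, beta) = Params(0.5, 2)
```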

Personally, I love simple packages with a simple interface. Another great one is ArgCheck.jl, which exposes a single macro @argcheck, providing nicer error messages than @assert.

In Julia, especially with the recent precompilation improvements, there is no practical downside to using a plethora of small packages. Small packages are also easier to maintain and improve, especially when the original maintainer is no longer available.

3 Likes

Maybe I’m misunderstanding you here, but they all bind looser, not tighter than :: which is IMO the only real complaint I’d have with a method like this.

julia> dump(:(a::b|c))
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol |
    2: Expr
      head: Symbol ::
      args: Array{Any}((2,))
        1: Symbol a
        2: Symbol b
    3: Symbol c

julia> dump(:(a::(b|c)))
Expr
  head: Symbol ::
  args: Array{Any}((2,))
    1: Symbol a
    2: Expr
      head: Symbol call
      args: Array{Any}((3,))
        1: Symbol |
        2: Symbol b
        3: Symbol c
3 Likes
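The same check can be written without reading through dump output. This sketch uses only Meta.parse and compares expression trees; the variable names are arbitrary.

```julia
# `::` binds tighter than `|`, so the annotation captures only `b`.
ex = Meta.parse("a::b | c")
ex.head        # :call
ex.args[1]     # :|
ex.args[2]     # :(a::b)

# With parentheses, the whole union becomes the annotation.
annotated = Meta.parse("a::(b | c)")
annotated.head # :(::)

# `|`, `∪` and `∨` sit at one precedence level and group left to right,
# so mixing them never regroups in favour of one operator.
same_level = Meta.parse("a | b ∪ c") == Meta.parse("(a | b) ∪ c") &&
             Meta.parse("a ∪ b | c") == Meta.parse("(a ∪ b) | c") &&
             Meta.parse("a ∨ b | c") == Meta.parse("(a ∨ b) | c")
```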

Yes indeed. Thanks for the correction.

1 Like

I submitted a proposal for including \mid in the Julia parser and setting its precedence higher than :: as an experimental feature:

In case people are interested, I posted a little package: GitHub - MasonProtter/InfixUnions.jl which has three options people can try out: |, ∪ (spelt \cup<TAB>), and ∨ (spelt \vee<TAB>).

This package commits no piracy on | or ∪, but instead locally shadows them when you do using InfixUnions: |.
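To illustrate the shadowing idea, here is a rough sketch. It is not the actual InfixUnions.jl source: the module names `LocalUnionOps` and `Demo` and the functions inside them are invented for the example.

```julia
# Hypothetical module; NOT the real InfixUnions.jl implementation.
module LocalUnionOps
# Defining `|` here creates a new function owned by this module.
# Base's `|` gets no new methods, so nothing is pirated.
|(a, b) = Base.:(|)(a, b)            # everything else falls back to Base
|(a::Type, b::Type) = Union{a, b}    # infix union for types
end

module Demo
using ..LocalUnionOps: |             # shadow `|` only inside this module

square(x::(Int | UInt)) = x^2        # parentheses still needed after `::`
bits() = 2 | 3                       # bitwise or keeps working
end
```

Code outside `Demo` keeps seeing the ordinary Base operator, which is the point of shadowing rather than extending it.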


My take on this after playing with them a little bit is that ∨ is horrible: it looks way too similar to v, takes too long to type, and the end result isn’t visually appealing. ∪ is more distinct from u and U in most fonts, but otherwise has the same problems as ∨. I quite like | though.

11 Likes

I wanted to quickly append another example. Say you are writing a regexp for matching a string representation of the type in each row in a table. You might get something like this:

julia> occursin(
           r"(Int64|Float64),(Int64|Missing),Float32,(String|Missing)",
           "Int64,Missing,Float32,String"
       )
true

julia> occursin(
           r"(Int64|Float64),(Int64|Missing),Float32,(String|Missing)",
           "Float64,Int64,Float32,Missing"
       )
true

julia> occursin(
           r"(Int64|Float64),(Int64|Missing),Float16,(String|Missing)",
           "Int32,Int64,Float32,Missing"
       )
false

I think this is part of the reason I find it intuitive to write:

julia> typeof(row) <: Tuple{(Int64|Float64),(Int64|Missing),Float32,(String|Missing)}

This works if you think of <: as a sort of advanced pattern-matching operator for spaces of types. Of course this analogy isn’t complete because occursin(r"Integer", "Int32") wouldn’t work in regexp, but if you think of Julia types as a much more powerful regular expression scheme, where Integer is automatically expanded to include the types in its hierarchy (UInt8|UInt16|...), perhaps the connection is a bit clearer.
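Spelled out with today's Union syntax, the three regexp cases above translate roughly as follows. The rows are made-up sample data and `RowPattern` is just a name for the example.

```julia
RowPattern = Tuple{Union{Int64,Float64}, Union{Int64,Missing}, Float32, Union{String,Missing}}

# Made-up rows mirroring the three regexp inputs.
row1 = (Int64(1), missing, 1.0f0, "a")
row2 = (2.5, Int64(3), 1.0f0, missing)
row3 = (Int32(1), Int64(3), 1.0f0, missing)

typeof(row1) <: RowPattern   # true
typeof(row2) <: RowPattern   # true
typeof(row3) <: RowPattern   # false: Int32 is neither Int64 nor Float64

# Unlike a regexp, an abstract type matches everything below it in the hierarchy.
typeof(row3) <: Tuple{Integer, Integer, AbstractFloat, Any}   # true
```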

Thanks for making this @Mason. It’s great that it’s possible to use | for union without a macro and there is no type piracy!

I would be happy to try this out for Union in SymbolicRegression.jl once your package is on the registry. I think SymbolicRegression makes sense as a guinea pig for a few different reasons:

  1. No existing use of | (for bitwise or). (edit: not needed – see comment below this one)
  2. The package is a bit unusual in that most people who contribute to it come from Python (the thin Python wrapper PySR is actually the more popular interface), and SymbolicRegression is typically their first time seeing Julia code. Thus the | syntax might make their Julia onboarding a bit quicker.
  3. There is a fairly extensive testing suite, both unit tests, end-to-end tests, and integration tests in the Python frontend. Thus any issues with the syntax should be caught pretty quickly.
  4. There are a few macros which wrap code containing Union – both internal macros and also ones from MLJ.jl – so we should quickly identify whether there are any structural problems that show up with that.
  5. I count 9,015 LOC, so it’s close to @Tamas_Papp’s recommended codebase size for testing the syntax
4 Likes

Note that the | in InfixUnions is backwards compatible with the one in Base.

1 Like

I just wanted to post a reminder that I created an unregistered package, OrUnions.jl:

julia> using Pkg

julia> Pkg.activate(; temp=true)

julia> Pkg.add(url="https://github.com/mkitti/OrUnions.jl")

julia> using OrUnions

julia> @orunion function f(x::Int | UInt) # no extra ( )
           x^2
       end
f (generic function with 1 method)

julia> methods(f)
# 1 method for generic function "f" from Main:
 [1] f(x::Union{Int64, UInt64})
     @ REPL[7]:1

julia> Int ∨ UInt
Union{Int64, UInt64}

julia> using OrUnions: |

julia> Int | UInt
Union{Int64, UInt64}

julia> 2 | 3
3

I did not implement ∪ (\cup) because types are not sets. I do provide a | that forwards to Base as well, though.

The one thing I implemented that InfixUnions.jl does not have is the @orunion macro, which attempts to change operator precedence. That means you do not need extra parentheses. If you can accept parentheses, then you can just use | or ∨, no macro needed.
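To give an idea of how a macro can work around the precedence, here is a rough sketch of one possible rewrite pass. It is not the OrUnions.jl implementation: `regroup_unions`, `@infixunion` and `area` are hypothetical names, and the pass naively rewrites every `(x::T) | U` it finds, including ones that were meant as a type assertion followed by a bitwise or.

```julia
# Hypothetical rewrite pass; NOT the actual OrUnions.jl code.
# Turns the parsed form `(x::A) | B` into `x::Union{A, B}`.
function regroup_unions(ex)
    ex isa Expr || return ex                  # symbols, literals, line numbers
    args = map(regroup_unions, ex.args)       # children first, so `x::A | B | C` chains
    if ex.head === :call && length(args) == 3 && args[1] === :| &&
       args[2] isa Expr && args[2].head === :(::)
        decl = args[2]
        T = decl.args[end]                    # the type that `::` captured
        return Expr(:(::), decl.args[1:end-1]..., :(Union{$T, $(args[3])}))
    end
    return Expr(ex.head, args...)
end

macro infixunion(ex)                          # hypothetical macro name
    return esc(regroup_unions(ex))
end

@infixunion area(x::Int | UInt) = x^2
```

The rewrite happens before lowering, so the method that gets defined is an ordinary `area(x::Union{Int, UInt})`.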

To compare the two packages:

  • Try InfixUnions.jl if you want a simpler package with importable ∪, ∨, or |.
  • Try OrUnions.jl if you want fewer parentheses via an exported macro, @orunion, along with an exported ∨ and an importable |.
2 Likes

Thanks, I just now realized that you can do using OrUnions: | as well (for some reason I thought OrUnions was reliant on macros; but the macro bit is only if you want to avoid (|) syntax due to the looser binding issue). So I guess they are both options for trying this out.

1 Like

Cool :slight_smile:

Unfortunately I don’t see any compatible way to get behavior like this in the actual Julia compiler. Changing the operator precedence in the parser is breaking, and disrespecting the semantics of the AST in lowering is quite evil in any circumstance :sweat_smile:
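A small illustration of why the precedence change would be breaking: the expression below is valid today and already has a meaning. `or_checked` is a made-up function name.

```julia
# Today `a::Int | b` parses as `(a::Int) | b`:
# assert that `a` is an Int, then take the bitwise or with `b`.
or_checked(a, b) = a::Int | b

or_checked(5, 2)   # 7

# The parser agrees with the explicit grouping.
same_parse = Meta.parse("a::Int | b") == Meta.parse("(a::Int) | b")
```

If `|` were made to bind tighter than `::`, the same text would instead read as `a::(Int | b)`, so existing code of this shape would silently change meaning or stop working.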

3 Likes

I find the way Cthulhu.jl displays types a bit verbose, so I see value in any attempt to reduce it.

Could Cthulhu.jl and other similar tools use the parenthesis approach?
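As a rough idea of what such a display could do, here is a sketch of a helper that renders a top-level Union in the parenthesized form. `infix_union_string` is a hypothetical name, it is not part of Cthulhu.jl or Base, it relies on the internal `Base.uniontypes`, and it does not descend into type parameters.

```julia
# Hypothetical helper; not part of Cthulhu.jl or Base.
function infix_union_string(T)
    T isa Union || return string(T)
    # Base.uniontypes is an internal helper that flattens a Union into its members.
    parts = [infix_union_string(S) for S in Base.uniontypes(T)]
    return "(" * join(parts, " | ") * ")"
end

rendered = infix_union_string(Union{Int64, Missing})
```

The order of the members follows whatever order Julia stores them in, so it may differ from the order in which the Union was written.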

1 Like