Summary of piping/chaining proposal?

There’s been a lot of posts about piping/chaining syntax, including:

The syntax is looking quite odd to my eyes and I’m really not following the thinking, but I’m wondering if someone—probably @uniment—could give as concise a summary of the current proposal as possible, especially separating the core idea from additional elaborations. The original idea of introducing \> and /> seemed pretty reasonable to me and now it’s veered into something I don’t get at all with curly braces and dots.

24 Likes

There have been a ton of ideas thrown around across those threads.

I believe the \> and /> thread (heh) of thought more or less culminated in the experimental JuliaSyntax PR here. It became very difficult to treat these two as generic partial application operators, but it seems they function pretty nicely as front-pipes and back-pipes, especially when they are allowed to be ‘headless’ for currying in simple situations like filter(/> foo, arr)

The other line of proposals (in my own words; possibly @uniment disagrees) is essentially exploring syntax solely for composition, in a way hopefully compatible with whatever #24990 figures out, if ever, for partial application.

That is {f, g, h} is more or less synonymous with h ∘ g ∘ f, but there is some ability to have intermediate let blocks or pass into a specific argument position via the keyword it or an _. The dot syntax to call the chain I think was chosen largely because it was available and matches OOP languages, but I kind of prefer still using the pipes x |> {f, g, h}
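To make the ordering concrete, here is a minimal sketch in plain Julia. The functions f, g, and h are placeholders invented for illustration, and the brace syntax itself is not used, only the composition it is said to resemble.

```julia
# Placeholder functions, invented for illustration only.
f(x) = x + 1
g(x) = 2x
h(x) = x^2

# h ∘ g ∘ f applies f first, then g, then h.
composed = h ∘ g ∘ f

# The same pipeline written with the existing |> operator.
piped(x) = x |> f |> g |> h

result_composed = composed(3)   # h(g(f(3)))
result_piped = piped(3)
```

Both forms run the steps left to right in reading order for the pipe, and right to left for ∘, which is why a chain listing f, g, h corresponds to h ∘ g ∘ f.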

Along those lines, I think a possibly more Julian / less controversial syntax would be simply to take the best parts of Chain.jl and DataPipes.jl into a built-in block type

chain
   ...
end

To be more or less synonymous with

x -> @chain x begin
    identity
    ...
end

And this is not incompatible with the aforementioned front/back pipes, nor with (most) of the proposed partial application solutions

3 Likes

It’s been quite a journey (for me, anyway)! I’ll summarize from my perspective, as the instigator and closest observer.

My intent from the start has been to find a chaining syntax which would be worthy of adoption into Julia, in large part to get better method discovery and autocomplete, but also because sometimes it’s simply more natural to express things this way (e.g., “the baby’s length” instead of “the length of the baby”). To me, the |> pipe operator fails at this primarily for four reasons: 1.) inability to specify more than one argument, 2.) low operator precedence, forcing the chain to be inconvenient as anything but a final operation, 3.) requirement to construct lambdas, hurting compile time, and 4.) terrible to type. What I have arrived at through this Odyssey is likely one of the most general chaining syntaxes in human history :sweat_smile:.
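A small sketch of the first three complaints about |>, using only Base Julia. The values are arbitrary and chosen just for illustration.

```julia
# |> binds more loosely than +, so the right-hand side swallows the `+ 1`.
parsed = Meta.parse("x |> f + 1")
same_as = :(x |> (f + 1))

# Passing a second argument through |> needs an explicit lambda.
rounded = 3.14159 |> (v -> round(v; digits = 2))

# Using the piped result inside a larger expression needs parentheses.
total = ([1, 2, 3] |> sum) + 1
```

The parse comparison is the low-precedence point in miniature: without parentheses the pipe ends up as the outermost operation of the expression.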

History: how the proposal evolved into its current form

First proposal:

I was hoping to kill two birds with one stone: to use partial application for chaining. (Also, I thought this would be easy :sweat_smile:)

I was a proponent of /> and \> (as syntax sugar for construction of FixFirst and FixLast partial applicator types), but that was until @CameronBieganek helped me realize that it didn’t quite work—not for partial application in the way I had imagined anyway. So, after learning more about PR#24990, I jumped ship for it as a more general partial application syntax, to the point of creating a generalized Fix partial applicator type for it (and doing benchmarks that showed favorable performance in comparison to Base.Fix1 and Base.Fix2). Sure, you’d live with some extra underscores, but the generality and transparency make up for it imo (and autocomplete would eventually make it a non-issue).

@c42f offered a JuliaSyntax demo showing how /> and \> could operate as partial applicators in a mirror form to how I had imagined (namely, to fix all-but-one argument), but by this point I had fallen out of love with them; I wanted chaining syntax which would work well with PR#24990 due to its greater generality. (Use of PR#24990 for chaining is essentially fixing all-but-one too, but without the constraint to first- or last- argument.)

Second proposal:

I pondered the issue, trying to understand what it was that people liked so much about Chain.jl, and I realized that its meaning for underscores, to be the result of the previous operation, is the exact definition of the English pronoun “it.” People love the concept of “it” because it allows us to do little tweaks here and there, allowing us to compose tasks which weren’t built to be composed. So I asked myself: Can I think of an unclaimed syntax which could work with PR#24990, and incorporate this meaning of “it” for more generalized function composition (the way our natural language affords us)?

So in the second proposal, I introduced the local keyword (unsurprisingly) it. I didn’t want its name to clash with _ underscore partial application, because they’re meaningfully different. But I really liked the extra flexibility it provided, which is exactly what people like so much about Chain.jl (and which is, in my estimation, what made #24990 so difficult to push through).

For occasions where you simply wanted to call a function, you’d type its name—and possibly use underscores for partial application as PR#24990 proposes—and for those other odd cases where you wanted a bit more, you’d say it. So I chose an unclaimed syntax --() and bounding parentheses in which it would be defined. For example: x--(f, it+it^2, g(_, 2, 3)) would mean let it=x; it=f(it); it=it+it^2; it=g(it, 2, 3) end. For greater generality, I figured you might want to declare functions this way too, so I proposed a “headless” --(f, g, h) to mean it->(it=f(it); it=g(it); it=h(it)).
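The lowering described above can be written out by hand in today's Julia. This is only a sketch: f and g are placeholder functions made up for the example, and the --() syntax itself does not parse, so only the expanded forms are shown.

```julia
# Placeholder functions, invented for illustration only.
f(v) = v + 1
g(a, b, c) = a * b + c

x = 2

# Hand-written lowering of the proposed x--(f, it+it^2, g(_, 2, 3)).
# A trailing `it` is added here so the block's value is explicit.
chained = let it = x
    it = f(it)
    it = it + it^2
    it = g(it, 2, 3)
    it
end

# Hand-written form of a "headless" chain --(f, sqrt): an ordinary function.
headless = it -> (it = f(it); it = sqrt(it))
```

Note that the g(_, 2, 3) step appears as a direct call g(it, 2, 3) rather than as a partial applicator that is built and then called.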

(Note: the direct substitution of g(_, 2, 3) into g(it, 2, 3), instead of g(_, 2, 3)(it), arose from @dlakelan’s continued prodding, which made me realize that partial application carried performance drawbacks, namely compilation time; it’d be preferable to do the substitution in-place if you know you’re simply going to consume the partial applicator anyway.)

Third proposal:

Some chatting with @christophE made me realize that not only is {} unclaimed syntax, but x.{} is unclaimed too. This made the hamster wheel in my head go crazy, because this a) requires no parser changes, so can be implemented today, and b) has exactly the operator precedence I want. So instead of x--(f,g,h) as in the second proposal, you’d type x.{f,g,h}, and instead of “headless” --(f,g,h), you’d write {f,g,h}. It’s a drop-in replacement for the second proposal.

But there’s a twist: {} is very powerful syntax; because it parses like [], you can construct 2-dimensional sets of expressions. I didn’t want to let such powerful syntax go to waste, so I asked the question: Can I meaningfully extend the concept of chaining to two dimensions? What would such a thing look like? Is it useful?

So in the third proposal, I dropped the discussion of partial application (to simplify the discussion), and I introduced some semantics for how expressions could spread across two dimensions. I also showed how you could implement a fast Fourier transform using these semantics.

And that brings us to today. Whew, that was actually kind of a lot :sweat_smile:

In short, the easiest way to imagine this proposal is taking the features of Chain.jl that people like, excluding parts that hurt its generality, including new things that extend its generality, and packaging it in a concise unclaimed syntax.

Core Behaviors:

Each expression is assumed to be either a function to be called, or an expression of it. (This is the same as Chain.jl, except using it instead of _.)

  1. x.{f; g} becomes a statement let it=x; it=f(it); it=g(it); it end. Notice the absence of a lambda, so there’s no compile-time penalty for using it.
  2. {f; g} becomes a function like it->begin it=f(it); it=g(it); it end.
  3. x.{f(it, y, z)} is let it=x; it=f(it, y, z); it end.
  4. {it+it^2} is a function like it->begin it=it+it^2; it end.
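As a rough illustration of these rules, here is a toy macro that performs the same kind of lowering. It is a hypothetical sketch written for this summary, not the MethodChains.jl implementation: the names @itchain and mentions_it are invented, and it handles only the two basic cases (a step that mentions it, and a step that is called on it).

```julia
# Does expression `ex` mention the symbol `it` anywhere? (hypothetical helper)
mentions_it(ex) = ex === :it
mentions_it(ex::Expr) = any(mentions_it, ex.args)

# Hypothetical macro, NOT the MethodChains.jl implementation:
# @itchain(x, f, g(it, 2)) lowers to
#     let it = x; it = f(it); it = g(it, 2); it end
macro itchain(x, steps...)
    body = Expr(:block)
    for step in steps
        # A step that mentions `it` is used as-is; otherwise it is called on `it`.
        rhs = mentions_it(step) ? step : :($step(it))
        push!(body.args, :(it = $rhs))
    end
    push!(body.args, :it)
    # esc keeps `it` as a literal name instead of a hygienic gensym.
    return esc(Expr(:let, :(it = $x), body))
end

lowered = @itchain([1, 2, 3], sum, it + it^2)
```

Because the expansion is a plain let block, no anonymous function is created for the chain itself, which is the point made in item 1.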

Notable decision points:

  1. I use it the same way that Chain.jl uses _, to mean the result of the previous expression. This is because I don’t want to claim _, so that it can remain free for use in partial application as PR#24990 proposes, and because the singular non-gendered object pronoun “it” carries the same exact meaning we’re after here.
  2. Simple chains, e.g. x.{first}.a to mean first(x).a, are possible because of high . operator precedence. I contend that this is an unalloyed good.
  3. Unlike Chain.jl which defaults to threading it into first argument position when it sees a function call, or DataPipes.jl which defaults to threading into last, I make no such assumption (this simplifies behavior to improve generality). Autocomplete will make this a non-issue anyway.
  4. Curly braces delimit the bounds of the chaining behavior. This enables single-argument “quick lambdas.”

Simple Extended Behaviors:

  1. Expressions are assumed either to be expressions of it, or to evaluate into functions to call on it. In cases where that’s obviously not true (e.g., :tuple or :generator expressions), no attempt is made to call them; they are simply assigned to it as-is.
  2. If there’s an assignment, then it is not assigned; this allows local variables to be declared. For example, x.{len = length(it); sum(it)/len} takes the mean of x by becoming let it=x; local len = length(it); it=sum(it)/len; it end.
  3. f(arg) do {g; h} end is an experimental alternate syntax for f({g; h}, arg). (I would prefer f(arg) do {g; h}, but the parser doesn’t allow that.)
  4. recurse is an experimental locally-defined keyword which I haven’t talked about. Inside callable chains, e.g. {it ≤ 1 ? it : recurse(it-1)+recurse(it-2)}, recurse is the function’s self-reference for recursion. This allows performant recursive chains (i.e., their self-reference is not boxed) to be assigned to non-const identifiers.
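Items 2 and 4 can be approximated by hand in current Julia. The sketch below writes out the let expansion quoted in item 2, and shows one conventional way to give a function a name for itself; the second part is an assumption about what a self-reference amounts to, not how the package implements it.

```julia
x = [1, 2, 3, 4]

# Hand-written form of x.{len = length(it); sum(it)/len}
m = let it = x
    local len = length(it)   # an assignment step: `it` is left untouched
    it = sum(it) / len
    it
end

# A locally named function can refer to itself, so the outer binding
# (here `fib`) does not need to be const for the recursion to work.
fib = let
    recurse(it) = it ≤ 1 ? it : recurse(it - 1) + recurse(it - 2)
    recurse
end
```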

Advanced Extended Behaviors (Multi-Chains):

  1. For parallel chains of execution, Multichains are implemented. Multichains can be used to specify parallel execution threads/distributed processes, or for graphically arranging algorithms (e.g. my toy FFT demo).
  2. A value can be distributed across new chains by splatting .... If new chains start without any previous splat, then the right-most value is copied.
  3. To collect the values of the parallel chains, use a local keyword them: this will collect the parallel chains’ it values into a single tuple. Otherwise, when the number of columns reduces, any uncollected values will be dropped.

All keywords defined within the context of {} are it, them, and recurse.

Most of the present debate seems to be either a) saber rattling that we should in fact claim _ as Chain.jl does (and murder PR#24990), b) that the multi-chain behavior is too general and confusing, c) that curly braces are somehow not Julian, or d) that achieving the consensus to obtain a chaining syntax is a fool’s errand. I can definitely get on board with a more verbose syntax for {} when multiline block expressions are to be made, but to me it seems silly to rally around banishing such a powerful brace syntax. And I’ve never had the wisdom to avoid a good fool’s errand :laughing:

As for murdering PR#24990… if the crowd chants loudly enough, then maybe the right move is to wash my hands like Pontius Pilate and order the execution. I’d like to believe not, but I am only one.

9 Likes

There are Base.Fix1 and Base.Fix2. However, when piping, the functions involved usually have more than 2 parameters. IMHO, the proposed curly braces syntax {f, g} allows people to legally write

x |> {f(a, b, _,c), g(_, d, kwd=e)},

without bothering to define a structure Fix3 such that f(x,y,_) :: Fix3{typeof(f)} in advance. Also, this stays purely at the level of surface syntax.

There is no unique way to parse it though. One is

Another might be just

(call g (parameters (kw kwd e)) (call f a b x c) d)

I suggest we should keep using independent packages with macros to
implement candidate chaining/piping/… syntax options until a community
consensus on the best approach is reached.

I particularly dislike the use of “{” or “}” and the introduction of complex
syntax into Julia and locking in these extra bracketing characters into
the language proper.

I have concerns that adding neat and cool language syntax that
creates dense, possibly difficult to follow code sections could
make the Julia language less successful or usable in the end.

There have been many interesting ideas presented and discussed.
Let's try them all out against each other for a year to get things
stable and robust and then revisit what makes the most sense
for Julia going forwards.

3 Likes

These discussions have suggested that there might be a place in the language for a generic Base.Fix{POSITIONS}(fun,values) function to generalize Fix1 and Fix2. For example, Base.Fix{(1,4)}(+,(11,14))(12,13,args...) == +(11,12,13,14,args...). This would at least make it slightly more ergonomic to chain via the existing |> in some situations.
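A minimal sketch of the kind of position-indexed partial applicator described above. The type name FixPositions is hypothetical and is not part of Base; the sketch handles positional arguments only (no keyword arguments, no from-end positions) and makes no attempt at performance.

```julia
# Hypothetical type, not part of Base: fixes the arguments at positions P.
struct FixPositions{P,F,V<:Tuple}
    f::F
    vals::V
end

# Convenience constructor so that only the positions need to be spelled out.
FixPositions{P}(f, vals::Tuple) where {P} =
    FixPositions{P,typeof(f),typeof(vals)}(f, vals)

# Calling the object interleaves the fixed values with the free arguments.
function (fx::FixPositions{P})(args...) where {P}
    total = length(P) + length(args)
    all(<=(total), P) ||
        throw(ArgumentError("a fixed position exceeds the argument count"))
    merged = Any[]
    next_free = 1
    for i in 1:total
        k = findfirst(==(i), P)
        if k === nothing
            push!(merged, args[next_free])   # free slot: take the next call argument
            next_free += 1
        else
            push!(merged, fx.vals[k])        # fixed slot: take the stored value
        end
    end
    return fx.f(merged...)
end

fix14 = FixPositions{(1, 4)}(tuple, (11, 14))
filled = fix14(12, 13, 15)
```

Using tuple as the wrapped function makes the final argument order visible; swapping in + reproduces the example from the post.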

Such a function should probably also be equipped to accept keyword arguments (fixed or nonfixed, probably with nonfixed overriding fixed), although there could be some debate on syntax there. It might also be useful to allow from-end-positional arguments to be fixed, but I think those would not be widely used and could be messy to work out if combined with front-indexed fixes (as collisions would be possible – although those could simply result in errors).

I kinda dislike the type of code this leads (me) to, when writing actual pipelines. Let me illustrate with a simple example. (Feel free to provide corrections or better style!)

process_list = list ->
  list.{
    map(convert(Float32, {it}), it),
    filter({it > 0}, it)
  }

(I’m assuming that each gullwing nesting leads to a new unique it. Correct?)

A counter-suggestion (other symbols might be preferable):

  1. Front/back passing (let binding shorthand):
    a _> f(b,c) and c >_ f(a,b) both mean f(a,b,c)
  2. Shorthand lambdas:
    it > 0 means x -> (x > 0)

Same example: (>_ and _> can be read as “goes into” or “smart pass” or similar.)

process_list = it >_
  map(convert(Float32, it)) >_
  filter(it > 0)

Here’s an important part: Both versions have an error (same one). Did you already notice it? If not, can you find it? Can you fix it? Solutions below. Try to solve it first, in your preferred version! :wink: (Assume that all definitions work as expected; this is a usage error.)











# 'it' is shorthand for x -> (x …) in the gull-less suggestion, so
convert(Float32, it) == convert(Float32, x -> (x)) ≠ x->convert(Float32, x)

# In short, a naked 'it' inside a function does not change the signature
# in the gull-less suggestion.

# Fixed versions:
process_list = it >_
  map(it >_ convert(Float32)) >_
  filter(it > 0)

process_list = list ->
  list.{
    map({convert(Float32, it)}, it),
    filter({it > 0}, it)
  }

Again, let me know if I’ve misunderstood the gullwings.

Almost. My third proposal is to make this legal:

x |> {f(a, b, it, c); g(it, d; kwd=e)}

or, in order to avoid compiling an unnecessary lambda and potentially suffering performance loss from its variable capture behaviors (and to have better [tighter] operator precedence),

x.{f(a, b, it, c); g(it, d; kwd=e)}

The key thing to note is that I’m explicitly avoiding claiming the _ character for the chaining syntax, because I think it serves very well for denoting partial application (which is a distinct concept from the it keyword, and would be very useful outside the bounds of {...}).

If this proposal is accepted and PR#24990 is accepted, then what you have written will be valid.



Considering the specifics of this problem (which I laid out in my first proposal), this doesn’t seem to be the best decision-making process here. I would propose instead accepting a chaining syntax[es] on a probationary basis for some period, i.e., with no assurance that the syntax[es] will continue to be part of the language after that period, and choosing in the end whether to keep it.

I do think we want to spend some more time and experience to make sure it’s robust and stable; I just don’t think that developing consensus without substantial firsthand experience as a language feature is meaningful; the signal-to-noise ratio there would be pretty low.

I will also note that the dominant chaining packages are actually not particularly conducive to the genericism desired of a language feature anyway; for example, semantics like in Chain.jl, which automatically threads into first argument position when _ isn’t specified, is convenient for DataFrames but isn’t particularly desirable for functional styles (for example, when using currying functions like filter(f::Function) or when using transducers as part of a chain); meanwhile DataPipes.jl, which automatically threads into last argument position, isn’t helpful for functions written in an object-oriented style like those for DataFrames, nor for the curried binary operators like >(5); these behaviors therefore take away from the genericism and composability you want of a proper language feature.
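The curried-operator case can be reproduced with Base alone. The sketch below does not use Chain.jl or DataPipes.jl; it only spells out what each threading convention turns the step >(5) into when the previous value is 10.

```julia
# >(5) is Base's curried comparison: a function equivalent to x -> x > 5.
gt5 = >(5)

# Calling the step on the previous value, as the proposal does.
called = gt5(10)              # 10 > 5

# Threading the previous value into the LAST argument slot of the call >(5).
threaded_last = >(5, 10)      # 5 > 10

# Threading it into the FIRST argument slot instead.
threaded_first = >(10, 5)     # 10 > 5
```

Threading into the last slot flips the comparison, which matches the surprising false in the DataPipes.jl transcript that follows.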

Example of Chain.jl poor behavior with Transducers
julia> using Chain, Transducers

julia> @chain collect(1:100) begin
           Map(x->2x)
           Filter(>(100))
           sum
       end
ERROR: MethodError: no method matching Map(::Vector{Int64}, ::var"#3#4")

julia> @macroexpand @chain collect(1:100) begin
           Map(x->2x)
           Filter(>(100))
           sum
       end
quote
    local var"##356" = collect(1:100)
    #= REPL[65]:2 =#
    local var"##357" = Map(var"##356", (x->begin
                        #= REPL[65]:2 =#
                        2x
                    end))
    #= REPL[65]:3 =#
    local var"##358" = Filter(var"##357", (>)(100))
    #= REPL[65]:4 =#
    local var"##359" = sum(var"##358")
    var"##359"
end
Example of DataPipes.jl poor behavior with curried operator
julia> using DataPipes

julia> @p begin
           10
           >(5)
       end
┌ Warning: Pipeline step top-level function is an operator. An argument with the previous step results is still appended.
│   func = ">"
│   args =
│    1-element Vector{Int64}:
│     5
└ @ DataPipes C:\Users\unime\.julia\packages\DataPipes\z06K1\src\pipe.jl:257
false

julia> @macroexpand @p begin
           10
           >(5)
       end
┌ Warning: Pipeline step top-level function is an operator. An argument with the previous step results is still appended.
│   func = ">"
│   args =
│    1-element Vector{Int64}:
│     5
└ @ DataPipes C:\Users\unime\.julia\packages\DataPipes\z06K1\src\pipe.jl:257
quote
    #= C:\Users\unime\.julia\packages\DataPipes\z06K1\src\pipe.jl:47 =#
    #= REPL[58]:2 =#
    var"##res#345" = 10
    #= REPL[58]:3 =#
    var"##res#346" = 5 > var"##res#345"
    #= C:\Users\unime\.julia\packages\DataPipes\z06K1\src\pipe.jl:48 =#
    var"##res#346"
end
How these should work
julia> using MethodChains, Transducers

julia> MethodChains.init_repl()

julia> (10).{>(5)}
true

julia> collect(1:100).{Map(x->2x);Filter(>(100)),sum}
7550

julia> @macroexpand collect(1:100).{Map(x->2x);Filter(>(100)),sum}
:(let it = collect(1:100)
      it
      it = (Map((x->begin
                      #= REPL[158]:1 =#
                      2x
                  end)))(it)
      it = (Filter((>)(100)))(it)
      it = sum(it)
      it
  end)

So far, I haven’t found any good reasons to resonate with this sentiment. Is it merely a protest over aesthetics? Also, there are many things in Julia *much* more complex than what I have proposed, which seems inevitable considering Julia’s target audiences.

We have only three sets of bracing characters available on our keyboards: parentheses (), square brackets [], and curly braces {} (four if you count angle brackets <>, but we put those to very good uses already). Banishing one of them from a desire for Python-zen seems non-Julian.

Julia chose to use Algol-like begin...end for block expressions, which was a wonderful decision because it made it more natural to use the same style for other blocks (e.g. let, if, etc.) and freed up {} for other things.

When I consider the uses for {}, they seem to be primarily for denoting unordered lists (for which we have Set() and don’t find useful enough to justify dedicated syntax), or for set-builder notation, or for denoting switching expressions (for which we have if...elseif), or for blocks of expressions (for which we have begin...end and (...; ...)). Compared with the hypothetical uses pondered here, using it for function chains seems the most interesting and useful.

Whereas (x,y,z) is great for assembling a collection of objects and f(x,y,z) for calling a function on it, {f,g,h} is nice for assembling a collection of functions and x.{f;g;h} for passing an object through it. It’s hard to ignore the beauty in the symmetry, even with the . dot; it’s akin to the symmetry in Julia’s decision that functions be objects and objects be functions.



@mikmoore I wasn’t smart enough to understand FixArgs.jl, so in my second proposal I developed my own partial applicator type (which is accessible here), which I showed to have favorable performance compared to Base.Fix1 and Base.Fix2.

I made mine to allow nonfixed kwargs to override fixed, and allow from-end-positional arguments to be fixed; it works like Fix{positions, num_of_args}(fun, fixedargs...; fixedkwargs...), where num_of_args dictates the number of arguments in the final call (and a -1 value indicates varargs). You can run it like this:

julia> using ChainingDemo

julia> Fix{(1, 3), 3}(f, 1, 3; a=1, c=3)
f(1, _, 3, ; a=1, c=3)

julia> @underscores f(1, _, 3; a=1, c=3)
f(1, _, 3, ; a=1, c=3)

julia> FixFirst(f, "hi!")
f(hi!, _...)

julia> FixFirst(f, "hi!") |> typeof
FixFirst{typeof(f), String, NamedTuple{(), Tuple{}}} (alias for Fix{typeof(f), (1,), -1, Tuple{String}, NamedTuple{(), Tuple{}}})

julia> Fix{(1,-3,-1), -1}(f, :start, :nextnextlast, :last)
f(start, _..., nextnextlast, _, last)

julia> @underscores f(:start, _..., :nextnextlast, _, :last)
f(start, _..., nextnextlast, _, last)

julia> @underscores filter(_%3==0, 0:10)
4-element Vector{Int64}:
 0
 3
 6
 9

julia> @underscores (xs = (x=>x^2 for x ∈ 1:4); map(_[2]/2, xs))
4-element Vector{Float64}:
 0.5
 2.0
 4.5
 8.0

Note: from-end indices are allowed only with varargs.

Also note: if you try it now, pretty-printing doesn’t work in the REPL because I made Fix subtype Function; in the REPL, objects that subtype Function have their Base.show overridden, so you need to call show manually.

I agree that a generalized Fix type would be very useful, and I think PR#24990 would be very nice syntax sugar for this. I think these would be incredibly helpful in many contexts, most notably when used with filter and map.

However, I don’t think that this combined with |> should be the preferred way to make chains, because using a partial applicator in a chain constructs a partial functor that will be used just once and discarded. This is wasteful for compile time and memory. And as I’ve also opined, |> is awkward and has the wrong precedence for most uses.
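For reference, this is what the two styles look like with the partial applicators that already exist in Base. The function g is a placeholder invented for the example, and the sketch only shows that the results agree; it does not measure compile time or memory.

```julia
# Placeholder function, invented for illustration only.
g(a, b) = a - b

x = 10

# Chaining through |> first builds a Base.Fix2 object, then calls it once.
via_pipe = x |> Base.Fix2(g, 3)

# The direct call that the chain ultimately stands for.
direct = g(x, 3)
```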

Notice that I made _ work as part of a partial application syntax like PR#24990; I do still like it and want it.



@MattEri

referenced code

Your example is incorrect, and should instead be:

process_list = {
    map({convert(Float32, it)}, it)
    filter({it > 0}, it)
}

and if PR#24990 were accepted, and using the curried form of filter available in Julia 1.9 (and supposing a partially-applied form of map were to become available too, e.g. map(f::Function) = FixFirst(map, f)), it could soon read like this:

process_list = { map(convert(Float32, _), _), filter(_>0) }

This is the same concept as my first proposal, except using claimed syntax which would require parser changes and break many things. My first proposal chose /> and \> (which are currently invalid and thus unclaimed syntax) to avoid these problems.

We cannot claim it outside of {} because it’s a valid identifier frequently used for iterators, and therefore claiming it would be a hugely breaking change. The nice thing about underscore _ is that using it as an rvalue is unclaimed syntax (yet it parses as an identifier), which makes it super interesting: it can be claimed outside special braces without causing a breaking change.

However, we must draw a distinction between the concept of a “quick lambda” and the concept of “partial application.” The debate of PR#24990 has persisted for years because of a desire to use _ to build “quick lambdas” which can do more than just partially-apply a single function. The problem with this idea is that, at the parser level, it’s generally impossible to tell where the bounds of such a “quick lambda” should be (the parser is unfortunately not a mind reader, even if Julia makes it seem so).

This makes the desire to form “quick lambdas” purely on the basis of using a special identifier untenable. That said, for most of the things that a “quick lambda” is wanted for, i.e. a single argument that passes through a couple simple functions (and never reaching a reducing function which combines it with itself in any way, nor encountering any branching logic), the partial applicator functor can work if combined with a function composition fallback as described here.

For example, this works with the demo code of my second proposal, mentioned above:

julia> using ChainingDemo

julia> @underscores @show g = √(2_+3) > 5;
g = √(Fix{(1,), 2}(*, 2) + 3) > 5 = >(_, 5) ∘ sqrt(_) ∘ +(_, 3) ∘ *(2, _)

julia> g(10)
false

julia> @underscores filter(√(2_+3) > 5, 5:15)
4-element Vector{Int64}:
 12
 13
 14
 15
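The composition printed by @show above can be rebuilt by hand with Base.Fix1, Base.Fix2, and ∘, without the demo package. This is a sketch of the equivalent function, not of how ChainingDemo constructs it.

```julia
# Hand-built equivalent of the printed composition
#   >(_, 5) ∘ sqrt(_) ∘ +(_, 3) ∘ *(2, _)
g = Base.Fix2(>, 5) ∘ sqrt ∘ Base.Fix2(+, 3) ∘ Base.Fix1(*, 2)

at_ten = g(10)               # sqrt(2*10 + 3) > 5
kept = filter(g, 5:15)
```

Each stage takes exactly one input and hands one output to the next, which is why this fallback covers only expressions where the argument follows a single path.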

However, for more general “quick lambdas” where the object takes more than one path and it interacts with itself, or if there will be branching logic, you need some unclaimed syntax to set the bounds of the expression within which the identifier has the desired meaning. Conveniently, my third proposal defines such bounds with {}, allowing you to construct arbitrary quick lambdas using it as an identifier. For example, {it+it^2} constructs a function that works like it->it+it^2.

Also notable, when you consider the use of “it” in English, it is for exactly this purpose of defining how an object should relate to itself (as well as for argument threading).



Summary

I think that in a perfect world we would have:

  1. A Fix functor type for generalized partial application (e.g. my demo code or FixArgs.jl)
  2. Use of _ underscores in a partial application syntax, à la PR#24990 (or my demo code)
  3. A function composition fallback for partial functions when passed as arguments to other functions. As mentioned in the PR, that might be tricky; I don’t currently have a way to judge how tricky.
  4. Implementation of my third proposal, to use {} for a chaining syntax. Out of my three proposals, I maintain that this one is the best.

At least in this case, I suspect the path to making the optimal decision is different from the optimal path to making a decision. Accepting these ideas into the language on a probationary basis seems appropriate (after some more vetting, certainly).

3 Likes

I think, if I understand this issue correctly, this Clojure macro could inspire our solution.

I am not even sure if this is on topic, and I should go to sleep. :sweat_smile:

Has there been any movement on this as far as incorporating a solution into the base language? I followed the @uniment threads closely for a while and while MethodChains.jl is not immediately natural looking to me (especially the usage of it and them) I think the core idea is very powerful, extremely general, and solves several issues of chaining in the core julia language.

I haven’t seen much discourse since then, but I think integration of even some of the more basic features (e.g. arbitrary partial application) would be greatly appreciated by the community. There are so many different frameworks trying to solve these related issues, so there is obviously a community want/need.

I’ve seen @uniment mention the Lisp Curse which maybe this falls under; perhaps the Julia language is too powerful, allowing many different solutions to a similar problem to be rapidly developed, leading to a lack of consensus.

Just some ramblings from a curious observer and Julia fan.

2 Likes

I’m not sure I would call it a “problem” or a “curse” (I think the Lisp curse applies to “functionality”/libraries more than maybe new syntax?!). I.e. piping/chaining is already an alternative to something that just works(?). Yes, there are many alternatives already for that alternative syntax. A language with macros allows that, there’s no way to stop people to come up with new ways. [Also no way to stop people making libraries with just functions.]

The only thing missing is a “blessed” way for that alternative, and maybe it’s not missing at all. I try to follow the discussion(s) here on it. It’s better not to add something to Base unless it’s perfect, and better than all the other alternatives…

I agree that the bar for Base should be extremely high. But no design choice is “perfect” or “better than all other alternatives” without qualifications. Everything has trade-offs. One such tradeoff is, by not having method chaining in Base, the tooling around the language can’t provide specialized support. As it stands, you either turn off “Missing symbol” checks or accept that your file will be filled with “errors” coming from Chain.jl’s use of underscores. It also means that, to read someone else’s code, you may need to read the manual of yet another chaining package.

There is also the mental load on new programmers to read about, weigh pros and cons of, and choose the right chaining solution for them. A better experience is to have a default option, and then, find an alternative that suits you better if you desire.

I think the bar should be related to a design’s genericity, flexibility, and impact on the developer experience. I think that there is room for debate/improvement on the margins for flexibility and genericity with these proposals, but the benefits to developer experience for just picking something that works and going with it is an important consideration too.

Put another way, by the nature of being the default, a Base implementation would have pros over alternatives.

4 Likes

Oh sure, and supporting arbitrary partial application is orthogonal to having neat short syntax for it! See e.g. JuliaLang/julia issue #36181 (Generalize `Base.Fix1` and `Base.Fix2`) and pull request #36180 (WIP: Generalize Fix1 and Fix2, by goretkin).

There are already multiple implementations in packages, and something this foundational is more important to have in Base compared to short surface-level syntax. One of the points of stuff like Base.Fix1 is that functions can dispatch on them – this doesn’t really work when there are several implementations of the concept and none in Base.

Moreover, the API design for this “generalized Fix” is more straightforward (IMO) than for neat surface syntax, it would be perfectly general, and foster composability. Meanwhile, for syntax there’s nothing wrong with having multiple domain-specific packages, and code written using either of them is composable with any other Julia code.

2 Likes

supporting arbitrary partial application is orthogonal to having neat short syntax for it

I totally agree; I’ve seen those PRs and the generalized package someone made for FixN, but it’s been several years with seemingly little/no movement. @mrufsvold said it well: although the bar for Base should be extremely high, there is no perfect implementation.

Core features such as method chaining/partial application having a dozen+ different community-made solutions do make the language feel a bit “incomplete”. This isn’t inherently bad per se; Python is probably the most widely used numerical language at the moment and is arguably only made “complete” (as a numerical language) by packages such as numpy. Unlike, for example, DataFrames in Python, which have multiple implementations (pandas, polars, …), numpy is the de facto solution for Python arrays. The difference with numpy is that there is a lot of groundwork and momentum there, so numpy is extremely robust and well-maintained. As amazing as MethodChains.jl by @uniment might be, I don’t know if he will continue to work on it for the next year, five years, or longer. I don’t know if it will break with new versions of Julia… This makes it a tough sell to incorporate into production code. All current method chaining packages (afaik) are maintained by enthusiasts who have no obligation to continue working on them at any point. There is no NumFOCUS-backed partial application solution… There is massive value in incorporating core features into Base; I understand that “clear & concise syntax” is very subjective, but I don’t see how that should stop us from adding general features such as FixN into Base.

I know this conversation is focused on method chaining but I see some of these themes as broader sentiments echoed within the Julia community frequently. I eagerly watched @uniment as he meticulously pushed this topic to its logical conclusion, and beyond :sweat_smile:; there has been lots of great back-and-forth discourse in those threads. If nothing else I hope I can at least respark that conversation because it would be a shame if those ideas were lost in the pursuit of a perfect solution. Perhaps community polls could help - I’d imagine something such as FixN would easily have tremendous support, but _ syntax being used for partial application? I don’t know, it would be great to find out.

1 Like

A small step forward would be to register @uniment’s MethodChains in the General registry. At the moment, because it’s not registered, it can’t be used in Pluto (unless one drops the built-in package manager), and in general its discoverability is limited.

Alas, the package creator is not engaging with the package anymore.

2 Likes

@uniment Considering the functional nature of Julia and the absence of object-oriented idioms, is it sensible to assume that MethodChains is semantically identical to pipelines, or is there a significant difference that we should be aware of?

New Julia User Opinion

As someone who has recently (6 months ago) started using Julia and who would like to use Julia more, I would like to see some sort of chaining feature in the core language. I use Chain.jl and I am fine with it, but I would really like to have LSP support and we can’t have that with macros. I also followed @uniment’s proposal and I would be fine with that too. I would personally be fine with anything that would offer object-oriented-like IntelliSense on objects, because using methodswith doesn’t compare to having all that info instantly when typing . in other languages.

Macro LSP?

I think one main difference with those packages is the fact that you get LSP integration, since those packages stay within the core language syntax. All the pipe chaining packages have to leverage macros to go outside the core language syntax and ultimately will never feel quite as polished or part of the language.

Is there a way we could provide some sort of API to the Julia language server that packages which create a DSL could use to provide LSP support for their specific syntax? Maybe it is a new lsp.toml file or something, so that when LanguageServer.jl sees a certain keyword like @chain, it applies a different set of rules for the LSP.

Maybe that breaks down once you then try to use a variable that was defined using a macro, since how would the LSP know the type of that variable? Maybe it could be sidestepped, though, by doing x::Int64 = @chain ... so the type would be known.
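To make the annotation idea concrete, here is a rough sketch. `@thread` is a hypothetical stand-in I wrote for a chaining macro like @chain (so the example needs no packages), and whether a language server would actually pick up the annotation is an assumption on my part, not something I’ve tested.

```julia
# Hypothetical stand-in for a chaining macro: threads `x` through each
# function named in the tuple, left to right.
macro thread(x, fs)
    ex = esc(x)
    for f in fs.args
        ex = :($(esc(f))($ex))
    end
    return ex
end

function total_length(words)
    # The annotation states the result type in the source text itself,
    # independent of whatever the macro expands to.
    n::Int64 = @thread words (join, length)
    return n
end

total_length(["ab", "cde"])   # expands to length(join(words))
```

Note that the annotation is not just a hint: Julia converts the right-hand side to `Int64` on assignment and throws if it can’t, so there is a small runtime check attached.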

I just wanted to throw that idea out there…

The macro static analysis issue is oft debated! I think, ultimately, the issue is that there are very few people with knowledge of LanguageServer.jl, those people have limited bandwidth, and there are higher priority issues in the LanguageServer.jl/VSCode extension neck of the woods when they do have time to make progress.

1 Like

Note that Pluto understands macros somehow:


Here, X in the second cell is underlined: Pluto knows it’s a global variable defined in another cell. And clicking on that X properly highlights X’s definition in the first cell (the screenshot shows the state after such a click).

Maybe Pluto’s approach can be (partially) applicable to LS as well?

1 Like

No, it can’t. Pluto is running the cells of your notebook, so the macro has already been expanded and executed. But LS can’t execute macros because 1) it is a static analyzer, and 2) even if we decided to give it runtime info, the macros may contain effectful functions, which is problematic.
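A small sketch of both points, using two hypothetical macros (`@define` and `@noisy` are made up for this post): the first introduces a name that is only visible after expansion, and the second shows that merely expanding a macro runs arbitrary code.

```julia
# Hypothetical macro: the assignment to X only exists after expansion,
# so a purely textual analysis of the call site never sees `X = ...`.
macro define(name, value)
    return esc(:($name = $value))
end

# Getting at the assignment requires running the macro body:
expanded = @macroexpand @define X 42

# A macro body is ordinary Julia code, so it can have side effects at
# expansion time, before the resulting expression is ever evaluated.
const EXPANSION_LOG = String[]

macro noisy(ex)
    push!(EXPANSION_LOG, "expanded $(ex)")   # runs during expansion
    return esc(ex)
end

# Expand only; the returned expression is never evaluated here.
result = macroexpand(@__MODULE__, :(@noisy 1 + 1))
```

After the last line, `EXPANSION_LOG` already has an entry even though `1 + 1` was never run. Swap the `push!` for file or network access and you can see why a tool can’t blindly expand macros it finds in your source.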

I’ve proposed elsewhere that users should be able to mark certain macros as expandable because they promise they are safe and effect-free. But that’s very complicated and returns us to the problem of developer bandwidth!

2 Likes