Fixing the Piping/Chaining Issue (Rev 3)

Personally I think { ... } would be ideal to replace all the ... end blocks in Julia. begin end is way too long for common block macros like chain, but IMO its replacement could be general to all macros, rather than privileging the chain use case.

let-end is a bit shorter than begin (:
DataPipes supports both (begin and let) for pipes, with corresponding differences in variable scopes.
The let-end semantics are arguably much more common and natural in this context; begin-end is only needed when multiple variables are defined in the pipe and used later outside of it.
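
To make the scoping difference concrete, here is a minimal sketch in plain Julia, without DataPipes. The function names with_begin and with_let and the variable tmp are made up for illustration; the sketch only shows how the two block kinds treat a temporary variable.

```julia
# begin-end does not open a new scope, so `tmp` remains a local of the
# enclosing function after the block finishes.
function with_begin(x)
    y = begin
        tmp = x .^ 2
        sum(tmp)
    end
    return y, @isdefined(tmp)
end

# let-end opens a new scope, so `tmp` is local to the block and is not
# visible after it.
function with_let(x)
    y = let
        tmp = x .^ 2
        sum(tmp)
    end
    return y, @isdefined(tmp)
end

println(with_begin([1, 2, 3]))
println(with_let([1, 2, 3]))
```

Both return the same value; they differ only in whether the temporary leaks out of the block.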

So proposal 4, which I am seeing scattered over a few comments, is to do all of the following:

  1. merge #24990 to get underscore partial application
  2. merge Chain.jl into Base with three changes
    a. make @chain no longer thread automatically into first argument
    b. change @chain x begin ... end semantics to simply x chain ... end or x pipe end
    c. (maybe?) allow links in the chain to be separated by commas or pipes rather than only newlines

FWIW, getting just 1 & 2, even without the changes, would already be great (I already find the syntax proposed by Chain.jl useful) :slight_smile:

Also, with 2.b) I realise the way to capture the output is not obvious. So at least this aspect should be carefully thought through.

maybe x |> chain ... end is better

@jar1: We already have a shorter alternative to begin a; b end for blocks: (a; b) :wink:

@jules: Appreciated. Considering that {...} parses differently than blocks anyway, that was a good move imo.

I do like the idea of having a more-verbose block syntax for multi-line {...}. Is there any analogy here? I don’t believe I’ve encountered any more-verbose block syntax for [...].

The let-end semantics is arguably much more common and natural in this context

@aplavin: I agree, I don’t like side-effects here; that’s why I’ve chosen to make x.{f, g} mean let it=x; it=f(it); it=g(it); it end.

This also makes its scope behavior consistent when you say: chain={f, g}; x.{chain}, because {f, g} means it->(it=f(it); it=g(it); it)
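
As a sanity check of that equivalence, here is the desugaring written out by hand in current Julia. The x.{f, g} and {f, g} forms are proposed syntax and are not used here; f, g and x are placeholder definitions chosen only for illustration.

```julia
# Placeholder functions and input, assumed for this sketch.
f(v) = v + 1
g(v) = 2v
x = 3

# What x.{f, g} is described as meaning: a let block that rebinds `it`.
result = let it = x
    it = f(it)
    it = g(it)
    it
end

# What {f, g} is described as meaning: an anonymous function of `it`.
chain = it -> (it = f(it); it = g(it); it)

println(result)
println(chain(x))
```

Because `it` is bound by the let block in one case and is the lambda argument in the other, neither form assigns to anything in the surrounding scope.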

I’ve actually even gone a step further, and made it so that any local assignments have the local keyword prepended—that’s how much I dislike side effects :sweat_smile:

@adienes and @Barget: The thing you must understand, is that the behavior of underscores in PR#24990 is incompatible with their behavior in Chain.jl. For example, PR#24990 would propose that map(_^2, _) should create a function that behaves like x->map(y->y^2, x), whereas Chain.jl treats it as if it’s x->map(x^2, x).
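
The two readings can be compared by writing both lambdas out by hand in current Julia. The names pr_reading and chain_reading are made up for this sketch, and neither line uses the underscore syntax itself.

```julia
# Reading attributed to PR#24990: each underscore binds to its innermost call.
pr_reading = x -> map(y -> y^2, x)

# Reading attributed to Chain.jl: every underscore is the same chained value.
chain_reading = x -> map(x^2, x)

println(pr_reading([1, 2, 3]))

# With a vector input the second reading fails, because it tries to
# square the vector itself before map is ever called.
chain_threw = try
    chain_reading([1, 2, 3])
    false
catch
    true
end
println(chain_threw)
```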

You have two possible paths:

  1. Adopt underscore behavior like Chain.jl, which then makes it impossible to use _ for partial application (at least, if we are to demand any consistency in language behavior).

  2. Adopt underscore behavior like PR#24990. This allows underscore partial application both inside- and outside-of chains.

You choose path (2), as it offers the ability to write _+1 instead of Base.Fix2(+, 1), as well as opening up all sorts of other partial application applications (e.g. if a function composition fallback is implemented, you could write filter(_%3==0, arr)).
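
For reference, this is roughly what those expressions have to look like today without underscore syntax. It is only a sketch: the names add1 and divisible_by_3 are made up, and rem is used as the function form of %.

```julia
# _+1 would stand for a partially applied +, which today is spelled:
add1 = Base.Fix2(+, 1)            # behaves like x -> x + 1

# filter(_%3==0, arr) relies on composing a partial application with a test.
# One way to spell that composition by hand:
divisible_by_3 = iszero ∘ Base.Fix2(rem, 3)   # behaves like x -> rem(x, 3) == 0

println(add1(2))
println(filter(divisible_by_3, 1:10))
println(filter(x -> x % 3 == 0, 1:10))  # the usual lambda spelling
```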

However, you soon run into the problem that you can’t make expressions where the same variable appears more than once: _+_^2 doesn’t behave as x->x+x^2 and instead throws an error. But you say, “This is a chain! Of course I meant the result of the last expression!”

You might go back and decide that you prefer (1). But that would prevent partial application syntax outside of chains, and you would burn _ on something which didn’t need it (remember that we’re creating a new context, in which we are at liberty to choose new keywords!).

So then you say, okay. Let’s choose path (2), and select a local keyword. You survey the landscape of identifiers. You consider ⬚, but it’s difficult to type. Then you realize, singular non-gendered object pronouns carry the exact meaning that you are after here. So then you arrive at it, and the exact behaviors of this proposal.

Which brings us back to:

Yes, but (a; b) is horrible. That should be syntax for FrankenTuples.

Syntax appears to be free for FrankenTuple definition :wink: I think the first step would be getting the type into Base.

julia> :( (a,; b=1) ).args
2-element Vector{Any}:
 :($(Expr(:parameters, :($(Expr(:kw, :b, 1))))))
 :a

Is that a major problem for #24990? If that’s the case, why not _1, _2, … for multiple arguments?

No, that example is not a problem for #24990; it’s a problem for Chain.jl.

I have learned two things from this long saga of threads on chaining:

  1. Generic partial application is really hard to be consistent and correct in all cases, and will almost certainly not be coming to Julia anytime soon.
  2. Chain.jl is pretty darn good and does indeed cover the majority of use-cases

So with that being said, I’d love to throw one more proposal into the wind: “Chain But Not A Macro”. It introduces the block chain ... end, which I view as a sister to do ... end, with the following differences from Chain.jl. I know this is redundant with my comment above; I’m just repeating it with more thoughts added.

  1. chain ... end always returns a function, and the input is accessed as a _ like any other line. That is, @chain df begin transform(...) end becomes df |> chain transform(_, ...) end
  2. explicit underscores are needed at every line to avoid arguments over whether first or last position is more important
  3. semicolons ; can be used instead of newline to continue the chain

This gives some limited power for partial application inside chain blocks without opening up the whole can of worms outlined in #24990. I think of it more as a convenient new syntax for some function definitions than as wholly new machinery.

For example, here is one thing you can do with these changes more easily than with current Chain.jl, since it takes only a single line and the underscore can come in the first line:

l2norm = chain _.^2; sum; sqrt end

Where now l2norm is a function.
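
For comparison, the same function can already be built in current Julia with function composition. This sketch is not the proposed chain ... end syntax, just one equivalent spelling; the name l2norm_composed is made up.

```julia
# Compose right to left: square elementwise, then sum, then take the root.
l2norm_composed = sqrt ∘ sum ∘ (x -> x .^ 2)

println(l2norm_composed([3, 4]))
```

The proposed block reads left to right and avoids naming the argument, which is the convenience being argued for.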

And if you really want, it can indeed be used for partial application like
filter(chain _%3==0 end, arr), although I’m not sure this is better than the -> syntax. I suppose it avoids the choice of a variable name.

I must agree with @Sukera that we are possibly starting to go in circles, although I do not share the pessimism towards all the proposals, so this will probably be my last thought on the issue :slight_smile:

Btw, that’s already l2norm = @f __.^2 |> sum |> sqrt with DataPipes (:

This is categorically incorrect. Partial application is very simple and straightforward.

What becomes difficult, is when it’s desired to cascade multiple function calls in sequence. Most of the controversy in #24990 has revolved around how to use _ to satisfy this desire, of building “quick lambdas” (i.e., not partial application), and to do it at the parser level. Unsurprisingly, that’s difficult.

However, as has been explored here (and expanded upon here and here), it works quite handsomely when paired with function composition. The relevant discussion is here.

That depends on how rational the Julia community can be. It seems that circling the wagons to make underscores behave as they do in Chain.jl has a tendency to whip people up into a fervor, making things difficult :sweat_smile:

By my proposal, l2norm = {it.^2, sum, sqrt}, and if PR#24990 is accepted, l2norm = {_.^2, sum, sqrt}, so I don’t see where your proposal adds any utility (other than defending your persistent desire to use _ as a stand-in for it).

Then I shall draw your attention back to this:

Personally, I kind of like the {} syntax too. But chain ... end feels closer to the existing patterns in Julia. If { ... } and chain ... end are synonyms then I do not really have a preference over them.

defending your persistent desire to use _ as a stand-in for it

:sweat_smile: I’m sorry, possibly it’s just pure aesthetic subjectivity, but I really think that it looks rather inelegant. I don’t like how the code reads; it makes me feel like I’m playing one of these

Using an underscore just makes a lot more sense as a placeholder variable to me. It doesn’t help that it is likely frequently used as a variable name, e.g. for iterands.

We are making progress! I'm not averse to subjectivity, as long as I understand where it's coming from.

In the debate of Bayesianism versus Frequentism, I fall into the Bayesian camp. Try as we might to find objectivity, we never quite get there. The pursuit of objectivity has, of course, proven generally worthwhile, but ultimately we fool ourselves if we do not acknowledge that every view we hold and every decision we make is subjective, a result of conjugating our observations with our priors along with a mishmash of heuristics and animal instincts.

While I have no sympathy for the masses who misuse it for iterators (they should be using itr), I do empathize with this sentiment. it requires you to dot your i’s and cross your t’s :sweat_smile:

It’s difficult to convince me that _ should be spent on chains, considering that its deprecation as an rvalue makes it a perfect fit for partial application—an idiom which is quite common and useful, and for which it’s also already used in Scala (so there is precedent). Underscores are such a perfect fit for partial application syntax, that the fit is far better than OJ’s glove. Since we are at liberty to select our own keywords within the context we are creating, it feels wasteful indeed to lay claim to the underscore.

You’ve previously raised the possibility of ⬚ instead of it, and my pushback has been that it a) perfectly carries the desired meaning and b) is more readily accessible in ASCII characters. However, on further thought, I think it’s feasible (and Julian) to make them synonyms, in the same way that ∈ and in are synonyms, as are ≤ and <=. (In fact, I use the Unicode characters so much that I momentarily forgot what <= means :sweat_smile:)

So I raise this possibility: to keep it and them as local keywords as proposed, and to make ⬚ a synonym for it, and ⬚s (plural of ⬚) a synonym for them. I’d love to hear your thoughts.

True, but I cannot put this inside an argument

map(@f __.^2, [1,2,3]) #error
map(chain _.^2 end, [1,2,3]) #[1, 4, 9]

Of course in simple examples like these the -> seems obviously better. But I could imagine making this multi-line

map(myarray) do chain
    ...
end

As a regular macro, I think you can write that like

map(@f(__.^2), [1,2,3])

It’s a little known feature that you can “call” macros like that, to explicitly pass their argument expressions instead of inferring them from the parsing rules.
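
Here is a small illustration of the difference, using a toy macro rather than DataPipes. @twice is a made-up name that exists only for this sketch.

```julia
# Toy macro: doubles whatever expression it is handed.
macro twice(ex)
    return :(2 * $(esc(ex)))
end

# Parenthesized call: the macro receives only `3`, and the rest of the
# line is parsed as ordinary code around the macro call.
a = (@twice(3), 10)
b = @twice(3) + 1

# Space-separated call: the macro receives everything up to the end of
# the expression, here `3 + 1`.
c = @twice 3 + 1

println(a)
println(b)
println(c)
```

The space-separated form is greedy, which is why a macro call used as one argument among several needs the parenthesized form.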

This is a common use case in data processing, and DataPipes aims to make it convenient!
The __ placeholder as the inner function argument starts the inner pipe. E.g.:

@p map([[1,2], [3]]) do __
    map(_ + 1)
    __.^2
    sum
end

We’re getting off-topic again, so I’ll leave this here.

@adienes again this question for you:

Doing Advent Of Code today, I remembered another issue with the curly braces - they are currently used in the Base.Cartesian module for some expansions.

julia> @macroexpand Base.Cartesian.@nexprs 4 i -> r_{i+10} = i^2
quote
    r_11 = 1
    r_12 = 4
    r_13 = 9
    r_14 = 16
end

So not breaking that should be kept in mind :slight_smile:
