Fixing the Piping/Chaining Issue

Ah, makes sense, my mistake.

I could make examples with foo, bar, and baz, but then it would be harder to understand and appreciate. The reason this proposal is not “clearly useful” in the examples you highlight is that the base functions have had currying specializations written, but that’s not something you should expect from arbitrary libraries.

I agree that Julia could use a more powerful interface for piping and currying. However, I still prefer PR #24990 (in its original, simple, tight-binding version).

I find it quite difficult to remember which operator is front fix and which operator is back fix. If I had guessed, I would have guessed that \> is front fix and /> is back fix.

In my opinion, one important issue with this proposal is that \> and /> look like piping operators, but in fact they are currying operators. Piping and currying are different things, and I think we should retain the difference. Here are a pipe function and a curry function that illustrate the difference:

pipe(a, f) = f(a)
curry(a, f) = (arg, args...) -> f(a, arg, args...)

julia> foo(x, y) = x + y;

julia> pipe(10, x -> foo(x, 20))
30

julia> curry(10, foo)(20)
30

We see that pipe immediately evaluates the function f on the argument a, whereas curry returns a new function that represents a partial evaluation of f. As a side note, Elm and Haskell retain piping and currying as separate concepts. I’m more familiar with Elm, so here’s an example in Elm:

> foo x y = x + y
<function> : number -> number -> number
> 10 |> foo 20
30 : number

In this example, currying is represented by foo 20 and piping is represented by ... |> .... Granted, the difference is a little murky in Elm, since applying a function f on an incomplete argument list is exactly how you do currying, and x |> f can be read as “apply f to x”. Yet they still include piping as a separate syntax.

Another fundamental issue with this proposal is that the proposed operators are right associative. This means that in many cases you need to read a “piping” expression from right to left in order to correctly interpret it, which defeats the purpose of piping, which is to read from left to right. The base piping operator is left associative, so you can in fact read base piping expressions from left to right.
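To make the associativity point concrete, here is a small sketch using only Base's |>. The helper functions double and inc are made up for illustration. It compares the chained form against the nested call, and compares the parsed expressions with and without explicit parentheses.

```julia
# Hypothetical helpers, only for illustration.
double(x) = 2x
inc(x) = x + 1

# Base's |> groups to the left: a |> f |> g is (a |> f) |> g.
left_to_right = 3 |> double |> inc
nested        = inc(double(3))        # the same calls written inside-out

println(left_to_right == nested)

# The parser produces the same expression with or without the parentheses.
println(Meta.parse("a |> f |> g") == Meta.parse("(a |> f) |> g"))
```

Because the grouping is to the left, the steps run in the order they are written.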

Another even more fundamental issue with your proposed implementation is that it doesn’t work. The fact that it doesn’t work is hidden by some macro magic in your implementation. Observe:

julia> foo(x, y) = x + y
foo (generic function with 1 method)

julia> fixdemo"10 /> foo(20)"
30

julia> FixFirst(foo(20), 10)
ERROR: MethodError: no method matching foo(::Int64)
Closest candidates are:
  foo(::Any, ::Any) at REPL[36]:1

This happens because you need to supply a function to FixFirst, but foo(20) is not a function. We need a way to turn foo(20) into a partially applied function. How about foo(_, 20)? :joy:

Responses to specific points

Regarding tab completion: I don’t think tab completion should be a guiding principle when deciding the best way to do piping and currying.

I don’t think this is such a strange justification. Even for map(f, x), you can’t really say one argument is more important than the other. They’re both equally important. I don’t think the do syntax singles out the first argument as being the most important argument. It’s just a handy syntax that takes advantage of the fact that in most functional languages higher order functions usually accept a function as the first argument.

Actually, PR #24990 allows you to use multiple underscores to leave multiple arguments unevaluated. So this,

foo(_, b, _)

is equivalent to this:

(a, c) -> foo(a, b, c)

Sidenote, it’s fun to observe how OO and functional folks go for each others’ throats, refusing that anyone should write code in a style other than their own. The OO guys who think a function’s first argument should be the object it operates on, and the functional guys who think a function’s first argument should be the function it operates on.

And the folks who like both and think each school of thought has merits in different contexts, who then get attacked by both. It’s like politics. :sweat_smile:

By forward slash and back slash. But as mentioned, the specific symbol isn’t set in stone.

This is a concern of mine too. If I had my way, we’d just deprecate the tired old dog |> for how lame it is :stuck_out_tongue_closed_eyes:

I kid, I kid. Or do I? :thinking:

In any case, there’s no way that’s gonna happen so I’m open to input.

While individual arguments (intra-call) are fixed in a right-associative manner, chained function calls (inter-call) are left-associative. Usually we only care to fix one or two arguments (fixing more than that is more convenient to do with standard anonymous function syntax), so my hope is that being right-associative shouldn’t be too mind-blowing.

This is incorrect. The correct interpretation of 10 /> foo(20) is not FixFirst(foo(20), 10), but instead FixFirst(foo, 10)(20). That is to say, 10 /> foo is a partially applied foo that operates as (args...; kwargs...) -> foo(10, args...; kwargs...).

Remember that /> and \> as specified have higher precedence than function call. It’s similar to how the do statement rips the function’s arguments away from its name before giving them back with a new argument inserted.
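As a rough sketch of that reading, assuming a FixFirst functor shaped like the one described in this thread (this is not an existing Base type), the expression 10 /> foo(20) would behave like the following plain Julia:

```julia
# Assumed shape of the FixFirst functor discussed in this thread.
struct FixFirst{F, X}
    f::F
    x::X
end

# Calling the functor prepends the fixed value to the argument list.
(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)

foo(x, y) = x + y

partial = FixFirst(foo, 10)   # the intended meaning of `10 /> foo`
result  = partial(20)         # the trailing (20) then calls it: foo(10, 20)
println(result)
```

The point is only that the fix happens first and the call happens second. How the parser would get there is a separate question.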

I have mixed feelings about this.

On the one hand, the features of a language should, in theory, be fully orthogonal to the capabilities of any particular IDE.

On the other hand, the entire purpose of a language is to facilitate communication between the human and the computer, and the IDE is an integral part of that process; language features which help the IDE to help the human can be a competitive advantage in how productive the act of programming is (hence why TypeScript is so popular despite being “just” a wrapper for JavaScript). And as programmers are human, they must eat, and to eat they must be competitive. For a language to survive, the people who use it must survive (this applies to spoken language too).

In this case, when confronted with two approaches which do mostly the same thing and are as theoretically clean as each other, my leaning is to choose the one which improves efficiency and offers more information for autocomplete to work with, even if the other is more aesthetic.

I think it’s a general statement that, no matter the specific implementation of the IDE, it can do more if more information is presented to it. This will probably continue to hold into the future, even as our IDEs increasingly adopt AI to infer what method we’re likely to call.

The do statement doesn’t exist in a vacuum; it also encourages making a function the first argument.

I’m not arguing that the other arguments are not necessary for the function call! Obviously they are. Rather, I argue that the first argument is being treated in a privileged or special manner, and that it is systemic to the language.

Not only that, but it’s okay; we shouldn’t fool ourselves and pretend otherwise. Argument lists are ordered, and in cases where the ordering can be meaningful, we should attempt to express that meaning in the ordering. It’s good practice anyway.

Ah, I had missed that. In that case, if implemented, _ underscore placeholder syntax can indeed do something that frontfix /> and backfix \> cannot (namely, to partially evaluate non-contiguous sets of arguments).

If placeholder syntax will allow varargs, perhaps by _..., and kwargs, maybe with ;_..., then what can be done with frontfix and backfix will strictly be a subset of what can be done with placeholder syntax.

From there, it seems a matter of evaluating the use cases and weighing the tradeoffs.

In my view, creating a partial evaluation of non-contiguous arguments of a function is specific enough to justify use of standard anonymous function syntax, and the question returns to largely one of having typed partial functions and better autocomplete versus having better aesthetics.

Sorry, I had missed the fact that the *specific* example you were quoting here is indeed a very contrived use—possibly the most contrived of any in this entire thread—of the fixing operators. Any piping, chaining, threading, currying, or partial application method would struggle to find justification for such heavy use in this example.

It should be easy enough to find examples in the wild. For chaining examples, look for anywhere Clojure’s threading macros get used, anywhere people use any of the chaining macros I listed in the OP (there’s a dozen, and I included links), or in OO languages you will find examples everywhere (and if you want to go absolutely bonkers, look up jQuery). For examples of partial application, look for uses of Base.Fix1 and Base.Fix2, which are heavily used in Base. The proposed operators will serve all these uses and more.
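For readers who have not run into them, here is a short sketch of how Base.Fix1 and Base.Fix2 get used today. The variable names are made up. The isequal line relies on the one-argument method of isequal in Base returning a Base.Fix2.

```julia
# Fix the first argument of a two-argument function.
add_one = Base.Fix1(+, 1)        # behaves like x -> 1 + x

# Fix the second argument of a two-argument function.
gt_two = Base.Fix2(>, 2)         # behaves like x -> x > 2

println(add_one(41))
println(filter(gt_two, [1, 2, 3, 4]))

# Several one-argument methods in Base return such partial applications.
println(isequal(3) isa Base.Fix2)
```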

I have a hard time remembering which slash is forward and which slash is back. :sweat_smile:

I’m not following you here. Can you elaborate?

What does that actually mean? /> is a function call, so does /> have higher precedence than itself? If we take a closer look at the operator precedence table, we see that “function call” does not have a precedence level in the table. The point of the precedence table is to tell us where to put parentheses in expressions involving multiple infix operators. However, function calls are not infix; function calls come with their own parentheses! If we didn’t have infix operators, we wouldn’t even need a precedence table.

Let us suppose that we have a binary function f, and that we can write f using infix notation. So, f(a, b) == a f b. And let us suppose that we modify your rule to “/> and \> have higher precedence than every other function call”. Let us consider the expression a /> f(b, c). As I mentioned, function calls have their own parentheses, so if we replace the function call with the infix operator f, we technically should have a /> (b f c), which won’t work because the right side of /> is not a function. But let’s ignore that for a moment and suppose that we have the expression a /> b f c. The precedence rule “/> and \> have higher precedence than every other function call” means that a /> b f c is equivalent to (a /> b) f c. But that won’t work because, again, b is not a function!

So, the rule that you have in mind is not a precedence rule. What you need is a rather unusual syntax transformation in the parser. The syntax transformation that you say you want is to transform

/>(10, foo(20))

into

(/>(10, foo))(20)

Or using infix notation for />, we transform

10 /> foo(20)

into

(10 /> foo)(20)

That’s an odd transformation, in my opinion. In fact, you’ve snuck in a function evaluation that is not specified anywhere in your proposal in the original post. I think the syntax transformation rule that you actually need is to transform

a /> f(b, c)

into

c /> b /> a /> f

In words, this syntax transformation could approximately be described as “decompose a function call f on the RHS of /> into a reverse-ordered chain of /> operations, ending in f”. That being said, I’m not a fan of the syntactic gymnastics that we have to go through in order to make this concept work.

Let’s take a look at this in action. I’m going to use ++ for front fix and --> for back fix, and I’ll manually place parentheses to enforce right associativity, so we can avoid using the fixdemo macro.

Here’s the implementation:

struct FixFirst{F, X}
    f::F
    x::X
end

struct FixLast{F, X}
    f::F
    x::X
end

(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)

++(x, f) = FixFirst(f, x)
-->(x, f) = FixLast(f, x)

And here’s a function foo:

foo(x, y) = x - y

Now, as expected, 10 ++ foo(20) doesn’t work unless the parser does a rewrite of the expression:

julia> 10 ++ foo(20)
ERROR: MethodError: no method matching foo(::Int64)
Closest candidates are:
  foo(::Any, ::Any) at REPL[7]:1

So, the parser must rewrite that expression to this:

julia> out = 20 ++ (10 ++ foo)
FixFirst{FixFirst{typeof(foo), Int64}, Int64}(FixFirst{typeof(foo), Int64}(foo, 10), 20)

Note that this returns a curried function. It does not return the value -10. You have to evaluate the function out (it is now a zero-argument function) to get the result -10:

julia> out()
-10

If all you have is currying, all you can get out is a function—you cannot get out a function evaluation. Your fixdemo macro somehow sneaks in a function evaluation somewhere (which is not specified anywhere in your original proposal).

By the way, I don’t think partially evaluating a one-argument function to get a zero-argument function should be allowed. In fact, with the underscore currying notation, it is impossible to do that, and Base.Fix1 does not allow it either:

julia> Base.Fix1(sqrt, 10)()
ERROR: MethodError: no method matching (::Base.Fix1{typeof(sqrt), Int64})()
Closest candidates are:
  (::Base.Fix1)(::Any) at operators.jl:1096

Let’s return to the expression 20 ++ 10 ++ foo (which for now we have to manually annotate as 20 ++ (10 ++ foo)). This “piping” expression demonstrates what I mentioned in my previous post: You have to read this pseudo-piping expression from right to left. If you try to read it from left to right (which is supposed to be the whole point of this proposal), you get an error:

julia> out = (20 ++ 10) ++ foo
FixFirst{typeof(foo), FixFirst{Int64, Int64}}(foo, FixFirst{Int64, Int64}(10, 20))

julia> out()
ERROR: MethodError: no method matching foo(::FixFirst{Int64, Int64})
Closest candidates are:
  foo(::Any, ::Any) at REPL[7]:1

When we pull on this thread, the whole thing unravels.

This is a well-stated criticism, and it made your concern clearer to me. However, I don’t think it means we need to throw out the entire idea. In particular, I disagree with the sentiment of this statement:

I don’t think tab completion should be a guiding principle when deciding the best way to do piping and currying.

I think method discoverability (and with it: tab completion) is a genuine and major pain point for many new users. There are languages with more perfectly pure/elegant currying semantics, but not many people seem to want to build a VSCode extension for a Lambda calculus, and the interactive coding experience is just as much part of the language as its syntax. In fact, Julia itself has made this tradeoff multiple times in some design choices to make REPL workflows smoother. Not that the following is a particularly useful/robust metric to optimize for, but just check out

Where 10% seems like a pretty high fraction of discussion. This proposal (or maybe something similar to it minus some of the other powerful properties it has) seems to be the cleanest idea to address this issue that I have seen in similar threads.

Also the fact that Fix1 and Fix2 are so frequently used in the wild suggests that an infix operator for something like FixFirst and FixLast could kill two birds with one stone. So while I definitely appreciate these concerns you are bringing up and think they are valid, I’d love to see your thoughts from a “yes and” mentality in case you think there is any way to salvage the proposal.

To me, this gets to the heart of my objection. What is really desired is syntax for which a macro is appropriate and which has zero runtime cost, but what we get with this proposal is a bunch of runtime-constructed functors that do have an associated runtime cost.

When we say

@chain foo begin
bar(baz, quux)
end

What we mean is bar(foo,baz,quux), which is perfectly representable as an expression.

What we mean when we say

foo \> bar(baz,quux)

is

\>(foo,bar(baz,quux))

which evaluates at runtime to a struct that has a callable method that calls essentially bar(foo,baz,quux).
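To illustrate the macro side of this argument, here is a minimal sketch of a parse-time rewrite. The macro name @insert_first is hypothetical (it is not the @chain macro from any package). It only handles a single call and simply splices its first argument into the call expression.

```julia
# Hypothetical macro: rewrite `@insert_first x f(a, b)` into `f(x, a, b)`
# at macro-expansion time, so no wrapper object exists at runtime.
macro insert_first(x, call)
    call isa Expr && call.head === :call || error("expected a function call")
    f, args = call.args[1], call.args[2:end]
    # Keyword arguments written after `;` are stored as a leading
    # :parameters block, which has to stay in front of the positionals.
    if !isempty(args) && args[1] isa Expr && args[1].head === :parameters
        return esc(Expr(:call, f, args[1], x, args[2:end]...))
    end
    return esc(Expr(:call, f, x, args...))
end

bar(a, b, c) = (a, b, c)
foo, baz, quux = 1, 2, 3

println(@insert_first foo bar(baz, quux))   # same as bar(foo, baz, quux)
```

The rearrangement is finished before anything runs, which is the zero-runtime-cost property being argued for here.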

Consider the following:

[1, 2, 3] \> filter(isodd) \> map(sqrt) /> join(", ")

This is equivalent to join(map(sqrt, filter(isodd, [1, 2, 3])), ", "), but unwrapped so the expression reads left-to-right as a chained sequence of function calls in the order of their execution. It executes as

FixFirst(join, FixLast(map, FixLast(filter, [1, 2, 3])(isodd))(sqrt))(", ")
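The expansion above can be checked in today's Julia without any new operators. This is a sketch that assumes FixFirst and FixLast functors of the shape described in this thread:

```julia
# Assumed functor definitions (not part of Base).
struct FixFirst{F, X}
    f::F
    x::X
end

struct FixLast{F, X}
    f::F
    x::X
end

(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)

# The expansion of the chained expression, written out by hand.
chained = FixFirst(join, FixLast(map, FixLast(filter, [1, 2, 3])(isodd))(sqrt))(", ")

# The ordinary nested call it is meant to be equivalent to.
nested = join(map(sqrt, filter(isodd, [1, 2, 3])), ", ")

println(chained == nested)
```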

To show how intra-call operator behavior is right-associative, it can also be written as:

[1, 2, 3] /> isodd /> filter() /> sqrt /> map() \> ", " \> join()

Here, we have pulled the arguments out of each function call, and now you begin to see the right-associativity. Let’s walk through how this will execute.

When calling filter, isodd fills into the first argument position, then [1, 2, 3] fills into the next first position, and then () calls the function. This works as FixFirst(FixFirst(filter, isodd), [1, 2, 3])(). Let’s call this result1. The remaining expression is:

result1 /> sqrt /> map() \> ", " \> join()

Next, when calling map, sqrt fills into the first argument position, and then result1 fills into the next first position, and then () calls the function. This works as FixFirst(FixFirst(map, sqrt), result1)(). Let’s call this result2. Now we have:

result2 \> ", " \> join()

Next, when calling join, ", " fills into the last argument position, and then result2 fills into the next last position, and then () calls the function. This works as FixLast(FixLast(join, ", "), result2)().

So as you can see by this example, the way that arguments are fed into an individual function call by these operators reads right-to-left (think of it like “whoever’s closest to the actual function gets there first”), but the overall chained expression of multiple function calls is still read left-to-right.
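Here is the same walkthrough as runnable steps, again a sketch assuming FixFirst and FixLast functors of the shape described in this thread. The name result3 is added here for the final value.

```julia
# Assumed functor definitions (not part of Base).
struct FixFirst{F, X}
    f::F
    x::X
end

struct FixLast{F, X}
    f::F
    x::X
end

(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)
(fixer::FixLast)(args...; kwargs...) = fixer.f(args..., fixer.x; kwargs...)

# Step 1: isodd is fixed closest to filter, then [1, 2, 3], then () calls it.
result1 = FixFirst(FixFirst(filter, isodd), [1, 2, 3])()   # filter(isodd, [1, 2, 3])

# Step 2: sqrt is fixed closest to map, then result1.
result2 = FixFirst(FixFirst(map, sqrt), result1)()         # map(sqrt, result1)

# Step 3: ", " is fixed as the last argument of join, then result2.
result3 = FixLast(FixLast(join, ", "), result2)()          # join(result2, ", ")

println(result3 == join(map(sqrt, filter(isodd, [1, 2, 3])), ", "))
```

Within each step the value nearest the function is fixed first, while the three steps themselves run in written order.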

Apologies, maybe I didn’t use the right word. I don’t know the exact vocabulary, as I’m not a parser guy.

/> and \> bind more tightly than the function call, so that even though a function would normally be called by following its name with () parentheses, its execution is delayed, the fix operator is called first, and then the resulting functor is instead called by the parentheses.

Indeed, as is the one triggered by the do statement.

Why, when I can simply transform a /> f(b, c) into FixFirst(f, a)(b, c)?

It’s possible to rewrite it your way if you insist, but it would be (c /> b /> a /> f)().

But then that would eliminate any benefits of a typed partial evaluation and require stranger things to be done to the parser.

The function evaluation was already there. In the example of a /> f(b, c), the function evaluation was invoked by (b, c).
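For what it's worth, the rewrite from a /> f(b, c) to FixFirst(f, a)(b, c) can be prototyped as an expression transformation. This is only a sketch: the function rewrite_frontfix and the macro @frontfix are hypothetical names, ++ stands in for /> (which does not parse today), and only a single binary use is handled.

```julia
# Assumed functor definition (not part of Base).
struct FixFirst{F, X}
    f::F
    x::X
end

(fixer::FixFirst)(args...; kwargs...) = fixer.f(fixer.x, args...; kwargs...)

# Hypothetical rewrite: `a ++ f(b, c)` becomes `FixFirst(f, a)(b, c)`,
# and `a ++ f` (no call on the right) becomes `FixFirst(f, a)`.
function rewrite_frontfix(ex)
    if ex isa Expr && ex.head === :call && length(ex.args) == 3 && ex.args[1] === :++
        lhs, rhs = ex.args[2], ex.args[3]
        if rhs isa Expr && rhs.head === :call
            f, callargs = rhs.args[1], rhs.args[2:end]
            # Peel the call off the right side, fix first, then re-apply the call.
            return Expr(:call, Expr(:call, :FixFirst, f, lhs), callargs...)
        end
        return Expr(:call, :FixFirst, rhs, lhs)
    end
    return ex   # anything else is left untouched
end

macro frontfix(ex)
    esc(rewrite_frontfix(ex))
end

foo(x, y) = x - y

println(rewrite_frontfix(:(10 ++ foo(20))))   # shows the rewritten expression
println(@frontfix 10 ++ foo(20))              # evaluates foo(10, 20)
```

The call on the right-hand side is never evaluated on its own. Its arguments are detached, the fix is applied to the bare function, and the arguments are then applied to the resulting functor.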

Why not? Serious question.

Base.Fix1 was written because people liked Base.Fix2, which itself was just meant for partial evaluation of 2-argument functions. These functions weren’t exactly architected with care to be generally useful. Yet, because the problem they address is so common, people want them to be generalized (see #36181).

Not if understood correctly.

Not just new users of the language, mind you, but new users of any API written in it. This is important.

Incorrect, foo \> bar(baz,quux) is (\>(foo,bar))(baz,quux), which is FixLast(bar,foo)(baz,quux).

If the type of foo is stable, constructing FixLast(bar,foo) can be done at compile time. This is no different from Base.Fix1 or Base.Fix2.

No, it’s equivalent to (i.e., parses as):

\>(\>(\>([1,2,3],filter(isodd)),map(sqrt)),join(","))

And it evaluates to a struct.

You are ignoring the OP.

So you’re asking for parsing changes.

If that is needed, then yes.

I should note that the demo macro works without changes to the parser; the only reason I have it parsing a string is because /> and \> are not currently accepted operators, so I use string search and replace to temporarily hijack existing operators.

What’s the point of the whole Fix1 and FixLast baloney if you’re doing parsing changes? Just parse

foo \> bar(baz) \> quux(foo2) /> bar2(baz2)

as…

bar2(baz2,quux(bar(foo,baz),foo2))

(EDIT: I think I’ve fixed my errors in the transformation.)
and forget all the Fix1 and such. The currying is irrelevant, which was my point from the beginning: what is desired is parse-time rearranging of expressions, so macros are the best way to do this.

Sorry, I may be mixing up \> and />, as @CameronBieganek also said he did.

(Note: when I see /> I think “stick the thing on the left into the position in the / direction” (i.e., to the right), and vice versa, but I think this is exactly the opposite of your original idea?)

If changes to the parser are indeed required, which I can’t say one way or the other because I’m not a parser guy, I’d prefer it be minimal.

Moreover, as @adienes mentions, there aren’t too many times you have the opportunity to kill two birds with one stone. People want chaining and autocomplete, and people want partial evaluation, so why not both?

This would evaluate as bar2(quux(foo2, bar(baz, foo)), baz2). As with anything new, you get the hang of it with practice. :wink:

EDIT:

Hm, if this is common then it might be better to flip the operators around and use \> for FixFirst, and /> for FixLast. I’m not picky one way or another.

That said, I suspect FixFirst will be more commonly used, and as /> is easier to type on conventional keyboards, I think I still lean toward the OP proposal.

Here’s a question for people that understand these things… is |> special syntax, or is it an actual function?

I’m not yet familiar with looking into the code of base julia stuff.

operators.jl line 911:

|>(x, f) = f(x)

I see, and that turns out to be in installdir/share/julia/base/operators.jl

I don’t understand how that works for something like:

a |> foo(b,c)

and yet it seems like it does. Does it only work when foo is a “currying function” (i.e., has methods where foo(b, c) = x -> foo(x, b, c))?
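One way to poke at this question: since |>(x, f) = f(x) is an ordinary function, the right-hand side foo(b, c) is evaluated first, and whatever it returns is then called with a. The sketch below uses made-up functions foo and bar to compare the case with and without a "currying" method.

```julia
# `foo` has both a full method and a hypothetical "currying" method.
foo(x, b, c) = x + b + c
foo(b, c) = x -> foo(x, b, c)      # returns a function waiting for x

# `bar` only has the full three-argument method.
bar(x, b, c) = x + b + c

# foo(2, 3) returns a closure, and |> then calls that closure with 1.
println(1 |> foo(2, 3))

# bar(2, 3) has no matching method, so the error comes from that call
# itself, before |> ever runs.
failed = try
    1 |> bar(2, 3)
    false
catch err
    err isa MethodError
end
println(failed)
```

So under these assumptions, a |> foo(b, c) works only when foo(b, c) itself returns something callable.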
