How often do you use the |> operator?

Regarding the Home/End keys: did you know that CTRL+A and CTRL+E jump to the beginning and the end of the line, respectively? This works in shells and in the Julia REPL.

5 Likes

cough, à la emacs, cough cough

2 Likes

The form above has a significant readability weakness in how the variables are chained. I sometimes use it with indentation that vertically aligns the intermediate result variables. With that alignment, it seems you (1+11 = author + upvoters) would have noticed that the given example accidentally does not chain result1 into result2. Here it is, adjusted to the layout I find most readable.

# coder is not drunk, keep it aligned also vertically
                            result1 = function1(input_data)
        result2 = function2(result1)
a = gpu(result2)
1 Like

Good catch! In real-life examples I try to use descriptive variable names though, so typos like this are less of a concern.

1 Like

I wouldn’t rely on manual indentation like this. I prefer

gpu(
    function2(
        function1(
            input_data
            )
        )
    )

If we’re stacking far out horizontally anyway.

The multi-line nested form is a matter of taste. It also does not store the intermediate results, which is sometimes preferable (as was under consideration in this sub-thread).
To me, the following one-liner fits the purpose of spacing just right:

gpu( function2( function1( input_data )))

in place of the same expression spread over several lines.

It might be subjective, but in the one-liner the closing parentheses immediately show what input_data is, and it does not consume 7 lines of vertical space for something that, to me, does not seem to be a visual improvement.
I’m also influenced by Lisp practice, which welcomes closing many parentheses on the last line of the block in question (they can be navigated with % in vi or C-M-p and C-M-n in emacs).

For me, the big advantage, at least when drafting code, is that the pipe operator lets me write down code in the order I think about it. Sometimes I think “hey, I need to apply function f to x”, then I type that and then remember I need to pipe it into some new function next. Instead of adding a new line / local variable, or adding more parens and moving the cursor back, I can just add a pipe operator. For me, it adds flexibility that matches the way I tend to think, which is not always linear but sometimes is.

As far as good practice / how you should use it in a library goes, there seem to be a lot of opinions on that. I stay out of it and just try to keep my code relatively clean.

One related thing I found when using the Elm language for a website is a reverse pipe operator:

f <| 2+2 == f(2+2) == 2+2 |> f

Not really sure why, but it made it easier for me to see at a glance how my code was structured.

TL;DR: syntactic convention is just a tool to help translate human thoughts to code, so, use what seems clear to you and supports your thought process.

I use it every day. Usually when I am hacking away at something.

It’s a good way to separate types of functions in a long composition chain.

a = g(f(x)) |> i |> h

So if f and g are similar but i and h are not, it can help break up the code a bit.

1 Like

I use DataFramesMeta.jl all the time and I personally love the @linq macro. I work a lot with survey data that is often tens of millions of rows by a few hundred columns in size and I’m typically measuring all sorts of different things that can be measured from the data so a typical operation looks something like this:

total = @linq data |>
    where(:foo .> 10) |>
    transform(bar = :baz .* 2) |>
    by(:blurb, total = sum(:blob)) |>
    sort(:total, rev = true)

In my actual work there are often many steps that one has to take to arrive at the correct answer so I like being able to have each operation separated out on its own line so that I can see exactly what steps were taken when I come back to the code at a later date. I really don’t like reading code where a bunch of intermediate/successive variables are created just to store one step of a data transformation process when it would be possible to chain everything together and do it at once and assign the result to a single logically-chosen variable name.

5 Likes

More so than the pipe operator, much of this discussion comes down to whether you prefer prefix or postfix functions. Julia’s functions are prefix, and using the pipe operator allows using postfix functions.

An advantage of postfix functions is the lexical order is the same as the execution order:

Prefix:

[parse(Int, x) for x in split(strip(foo))]

(hypothetical) Postfix functions:

foo.strip.split.map(parse(Int, _))

Postfix with pipes — very similar to postfix functions above:

@pipe foo |> strip |> split .|> parse(Int, _)

As pipelines get longer, the difference between lexical and execution ordering grows when using prefix functions. Here’s a pipeline that would arguably be very difficult to read with prefix syntax, but would be fine with the hypothetical postfix functions above:

x = @pipe foo |> strip |> split .|> parse(Int, _) |> sort |> [0, _..., maximum(_) + 3]
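For comparison, here is a minimal sketch of the same pipeline using only Base, where explicit anonymous functions stand in for the `_` placeholder. The sample string bound to `foo` is an assumption made purely for illustration.

```julia
# Sample input; the value of `foo` is an assumption for illustration.
foo = " 3 1 2 "

# Base-only version: each step that needs extra arguments becomes a
# parenthesized anonymous function, and `.|>` broadcasts over the elements.
x = foo |> strip |> split .|> (s -> parse(Int, s)) |> sort |> (v -> [0, v..., maximum(v) + 3])
```

The parentheses around each anonymous function matter: without them, the body of the lambda would swallow the rest of the chain.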
1 Like

I think the usefulness of |> stems from the “algebraic” property that is similar to matrix * vector:

(C' * B' * A')' * x == A * (B * (C * x))
x |> (f ∘ g ∘ h) == x |> h |> g |> f

In fact, if we define f <| x = f(x), the similarity is much clearer

(A * B * C) * x == A * (B * (C * x))
(f ∘ g ∘ h) <| x == f <| g <| h <| x

Or, equivalently, with the opposite composition operator g ⨟ f = f ∘ g which is similar to (_' * _')':

(C' * B' * A')' * x == A * (B * (C * x))
(h ⨟ g ⨟ f)(x) == x |> h |> g |> f

Given this observation, it is somewhat interesting that some comments against |> can also be applied to * (or any other infix operator).

I think using binary operators is a good way of emphasizing the algebraic properties of your program. This, in turn, can help readability and editability. (The argument also holds the other way around: it is probably a bad idea to use a binary operator if you don’t have any algebraic properties.)
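The identities above can be checked on small inputs. The following is only a sketch: the matrices, the vector, and the functions `f`, `g`, `h` are arbitrary values chosen for illustration, and `<|` is defined here by hand as described above, since Base gives it no methods.

```julia
using LinearAlgebra  # standard library: matrix products and adjoints

# Reverse application, defined by hand; `<|` parses as a right-associative
# operator in Julia. The argument names are arbitrary.
fn <| arg = fn(arg)

# Arbitrary example functions (assumptions for illustration).
f(x) = x + 1
g(x) = 2x
h(x) = x^2

# Arbitrary integer matrices and vector, so the comparisons are exact.
A = [1 2; 3 4]
B = [0 1; 1 0]
C = [2 0; 0 2]
v = [1, 1]

matrix_ok  = (A * B * C) * v == A * (B * (C * v))
adjoint_ok = (C' * B' * A')' * v == A * (B * (C * v))
pipe_ok    = (3 |> h |> g |> f) == (f ∘ g ∘ h)(3)
reverse_ok = ((f ∘ g ∘ h) <| 3) == (f <| g <| h <| 3)
```

All four comparisons should evaluate to `true`: grouping does not change the result, which is the associativity-like property the post refers to.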

8 Likes

I’ve begun to love it. A chain like

filter(!isempty, map(((n, L),) -> (n, strip(L)), filter(x -> isodd(first(x)), enumerate(eachline(file)))))

is much more easily read as

eachline(file) |> enumerate |> filter(isodd ∘ first) |> map() do (n, L)
    n, split(L)
end |> filter(!isempty)

In fact, the chain reads like a series of assignments (which some would prefer, exactly because they are read in order).
Some languages allow for these chains with dot syntax. I think the |> operator is better.
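A runnable Base-only variant of such a chain might look like the sketch below. It rests on several assumptions: a small vector of strings stands in for `eachline(file)`, the lazy `Iterators.filter` and `Iterators.map` replace the curried `filter` and `map` forms, and the final step is interpreted as dropping entries whose stripped line is empty.

```julia
# Stand-in for `eachline(file)`; the contents are an assumption for illustration.
lines = ["alpha  ", "", "  gamma", "delta", ""]

result = lines |> enumerate |>
    (it -> Iterators.filter(isodd ∘ first, it)) |>             # keep odd line numbers
    (it -> Iterators.map(((n, L),) -> (n, strip(L)), it)) |>   # strip whitespace from each line
    (it -> Iterators.filter((!isempty) ∘ last, it)) |>         # drop entries with an empty line
    collect
```

Each stage stays lazy until `collect`, so no intermediate arrays are built along the way.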

4 Likes

Thanks for the replies.
From what I understand the use of |> is a matter of style, but many people don’t use the pipe operator because of its limitations (having to write a lambda every time the function to the right of |> takes more than one argument, or having to rely on external macros). I also don’t like writing lambda functions.


For example:

a = [1:100;]
a |> 
    x -> reshape(x, 10, 10)   |>    # reshape to 10x10 matrix
    x -> my_function(x, args) |>    # apply a function with other args to the matrix
    x -> reshape(x, size(a))  |>    # convert to original size
    x -> filter(iseven, x)          # keep the even elements

What do you think about this? (It’s probably a silly solution. It seems to work, but I don’t know whether it’s bad practice.)

|>₁(a, f_args::Tuple) = f_args[1](a, f_args[2:end]...)
|>₂(a, f_args::Tuple) = f_args[1](f_args[2], a, f_args[3:end]...)
|>₃(a, f_args::Tuple) = f_args[1](f_args[2], f_args[3], a, f_args[4:end]...)
a = [1:100;]
a |>₁
    (reshape, 10, 10)   |>₁
    (my_function, args) |>₁
    (reshape, size(a))  |>₂
    (filter, iseven)
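Another option that needs neither lambdas, macros, nor new operators is partial application with `Base.Fix1` and `Base.Fix2`, which fix the first or second argument of a two-argument function. This is only a sketch: `permutedims` and `vec` are stand-ins for the hypothetical `my_function` step and the reshape back to a vector.

```julia
a = [1:100;]

evens = a |>
    Base.Fix2(reshape, (10, 10)) |>   # same as x -> reshape(x, (10, 10))
    permutedims |>                    # stand-in for a one-argument step
    vec |>                            # flatten back to a vector
    Base.Fix1(filter, iseven)         # same as x -> filter(iseven, x)
```

This only covers functions called with exactly two arguments, so it is narrower than the tuple-based operators above.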
5 Likes

Use Chain.jl. It solves a lot of these pain points.

3 Likes

Thank you, but I would prefer not to use macros from external libraries

Yea perhaps what this thread is still missing is a list of all the packages that improve upon the base pipe operator. To name a few:

11 Likes

I am curious about the reason for this.

I’ve sometimes tried to use the |> from Base, but then I run into cases when it doesn’t work, find I don’t understand why, and give up.

The examples in the Chain.jl README probably explain why, though I’m not 100% sure.

I’ve not used the ones in packages because I haven’t wanted to spend time to learn the differences and choose one. What criteria would I use for choosing anyway? For me it would serve best if there was a “community recommendation” to use one and not the others. (This is also true more generally.)

There’s also the minor point that to type | with a Finnish/Swedish keyboard requires option-7. The keyboard key itself has 7 and / on it.

There is an issue

https://github.com/JuliaLang/julia/issues/5571

and a proposal

https://github.com/JuliaLang/julia/pull/24990

1 Like

I’m a student and I don’t like people reading my code to have to know about libraries that aren’t strictly necessary.

4 Likes