How often do you use the |> operator?

Regarding the Home/End keys: did you know that Ctrl+A and Ctrl+E jump to the beginning and the end of the line, respectively? This works in shells and in the Julia REPL.

5 Likes

cough, à la emacs, cough cough

2 Likes

The form above has a significant readability weakness in how the variables are chained. Sometimes I use it with indentation that vertically aligns the intermediate result variables. With that layout, it seems you (1+11 = author + upvoters) would have noticed that the given example accidentally does not chain result1 into result2. Here it is, corrected and laid out the way I find most readable:

# coder is not drunk, keep it aligned also vertically
                            result1 = function1(input_data)
        result2 = function2(result1)
a = gpu(result2)
1 Like

Good catch! In real-life examples I try to use descriptive variable names though, so typos like this are less of a concern.

1 Like

I wouldn't rely on manual indentation like this. I prefer

gpu(
    function2(
        function1(
            input_data
            )
        )
    )

if we're stacking far out horizontally anyway.

The multi-line nested form is a matter of taste, and it does not store intermediate results, which is sometimes preferable (as was under consideration in this sub-thread).
To me, the following one-liner uses spacing to just the right effect:

gpu( function2( function1( input_data )))

in place of the same expression spread over seven lines.

It might be subjective, but in the one-line form the closing parentheses show immediately what input_data is, and it does not consume 7 lines of vertical space for something that, to me, is not a visual improvement.
I'm also influenced by Lisp practice, which welcomes closing many parentheses on the last line of the block in question (these can be navigated with % in vi, or C-M-p and C-M-n in Emacs).

For me, the big advantage, at least when drafting code, is that the pipe operator lets me write down code in the order I think about it. Sometimes I think "hey, I need to apply function f to x", then I type that and then remember I need to pipe it into some new function next. Instead of adding a new line or local variable, or adding more parens and moving the cursor back, I can just add a pipe operator. For me, it adds flexibility that matches the way I tend to think, which is not always linear but sometimes is.

As far as good practice and how you should use it in a library go... there seem to be a lot of opinions on that. I stay out of it and just try to keep my code relatively clean.

One related thing I found while using the Elm language for a website is a reverse pipe operator:

f <| 2+2 == f(2+2) == 2+2 |> f

Not really sure why, but it made it easier for me to see at a glance how my code was structured.

TL;DR: syntactic convention is just a tool to help translate human thoughts to code, so, use what seems clear to you and supports your thought process.

I use it every day. Usually when I am hacking away at something.

It's a good way to separate types of functions in a long composition chain.

a = g(f(x)) |> i |> h

So if f and g are similar but i and h are not, it can help break up the code a bit.

1 Like

I use DataFramesMeta.jl all the time and I personally love the @linq macro. I work a lot with survey data that is often tens of millions of rows by a few hundred columns in size, and I'm typically computing all sorts of different measures from the data, so a typical operation looks something like this:

total = @linq data |>
    where(:foo .> 10) |>
    transform(bar = :baz .* 2) |>
    by(:blurb, total = sum(:blob)) |>
    sort(:total, rev = true)

In my actual work there are often many steps that one has to take to arrive at the correct answer, so I like having each operation separated out on its own line so that I can see exactly what steps were taken when I come back to the code at a later date. I really don't like reading code where a bunch of intermediate variables are created just to store one step of a data transformation process, when it would be possible to chain everything together and assign the result to a single logically chosen variable name.

5 Likes

More so than the pipe operator, much of this discussion comes down to whether you prefer prefix or postfix functions. Julia's functions are prefix, and using the pipe operator allows using postfix functions.

An advantage of postfix functions is the lexical order is the same as the execution order:

Prefix:

[parse(Int, x) for x in split(strip(foo))]

(hypothetical) Postfix functions:

foo.strip.split.map(parse(Int, _))

Postfix with pipes, very similar to the postfix functions above:

@pipe foo |> strip |> split .|> parse(Int, _)

As pipelines get longer, the difference between lexical and execution order grows when using prefix functions. Here's a pipeline that would arguably be very difficult to read with prefix syntax, but would be fine with the hypothetical postfix functions above:

x = @pipe foo |> strip |> split .|> parse(Int, _) |> sort |> [0, _..., maximum(_) + 3]
1 Like
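
For readers without Pipe.jl, the two pipelines above can be sketched in Base Julia alone. The sample string `foo` is made up for illustration, and the parenthesized anonymous functions stand in for the `_` placeholder that `@pipe` provides:

```julia
foo = "  3 1 2 "   # made-up input, not from the thread

# Prefix form versus Base-only pipe form of the short pipeline
prefix = [parse(Int, x) for x in split(strip(foo))]
piped  = foo |> strip |> split .|> (s -> parse(Int, s))
@assert prefix == piped

# The longer pipeline; each lambda is parenthesized so its body
# does not swallow the rest of the chain
longer = foo |> strip |> split .|> (s -> parse(Int, s)) |> sort |>
    (v -> [0, v..., maximum(v) + 3])
@assert longer == [0, 1, 2, 3, 6]
```

The parentheses around each lambda matter: without them, everything to the right of `->` becomes part of the function body.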

I think the usefulness of |> stems from the "algebraic" property that is similar to matrix * vector:

(C' * B' * A')' * x == A * (B * (C * x))
x |> (f ∘ g ∘ h) == x |> h |> g |> f

In fact, if we define f <| x = f(x), the similarity is much clearer

(A * B * C) * x == A * (B * (C * x))
(f ∘ g ∘ h) <| x == f <| g <| h <| x

Or, equivalently, with the opposite composition operator g ⨟ f = f ∘ g which is similar to (_' * _')':

(C' * B' * A')' * x == A * (B * (C * x))
(h ⨟ g ⨟ f)(x) == x |> h |> g |> f

Given this observation, it is somewhat interesting that some comments against |> can also be applied to * (or any infix operators).

I think using binary operators is a good way of emphasizing the algebraic properties of your program. This, in turn, can help readability and editability. (The argument also holds the other way around: it is probably a bad idea to use a binary operator if you don't have any algebraic properties.)
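
The identities above can be checked with a small sketch. The functions `f`, `g`, `h` and the matrices are arbitrary choices for illustration, and `<|` is defined here by hand, since Base parses it as an infix operator but gives it no method:

```julia
# Reverse pipe, defined as suggested above (not part of Base)
<|(f, x) = f(x)

# Arbitrary example functions
f(x) = x + 1
g(x) = 2x
h(x) = x^2

x = 3
composed = x |> (f ∘ g ∘ h)
chained  = x |> h |> g |> f
reversed = f <| g <| h <| x      # `<|` associates to the right
@assert composed == chained == reversed == f(g(h(x)))

# The matrix analogue, with small integer matrices so equality is exact
A = [1 2; 3 4]
B = [0 1; 1 0]
C = [2 0; 0 2]
v = [1, 1]
@assert (A * B * C) * v == A * (B * (C * v))
@assert (C' * B' * A')' * v == A * (B * (C * v))
```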

8 Likes

I've begun to love it. A chain like

filter(!isempty, map(((n, L),) -> (n, strip(L)), filter(x -> isodd(first(x)), enumerate(eachline(file)))))

is much more easily read as

eachline(file) |> enumerate |> filter(isodd ∘ first) |> map() do (n, L)
    n, strip(L)
end |> filter(!isempty)

In fact, the chain reads like a series of assignments (which some would prefer, exactly because they are read in order).
Some languages allow for these chains with dot syntax. I think the |> operator is better.
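
A runnable approximation of that chain, as a sketch only: the `lines` vector is a made-up stand-in for `eachline(file)`, the `collect` step and the explicit lambdas are my additions so that everything works with plain Base `filter` and `map`, and I assume the final filter is meant to drop entries whose stripped text is empty:

```julia
# Made-up stand-in for eachline(file)
lines = ["  alpha ", "skip me", "", "also skipped", " beta"]

result = lines |>
    enumerate |>
    collect |>                                     # materialize (index, line) pairs
    (v -> filter(isodd ∘ first, v)) |>             # keep odd-numbered lines
    (v -> map(((n, L),) -> (n, strip(L)), v)) |>   # trim whitespace
    (v -> filter(p -> !isempty(last(p)), v))       # drop lines that are now empty

@assert result == [(1, "alpha"), (5, "beta")]
```

Each step still reads top to bottom in execution order, at the cost of one lambda per multi-argument call.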

4 Likes

Thanks for the replies.
From what I understand, the use of |> is a matter of style, but many people don't use the pipe operator because of its limitations (having to write a lambda every time the function to the right of |> takes more than one argument, or having to rely on external macros). I also don't like writing lambda functions.


For example:

a = [1:100;]
a |> 
    x -> reshape(x, 10, 10)   |>    # reshape to 10x10 matrix
    x -> my_function(x, args) |>    # apply a function with other args to the matrix
    x -> reshape(x, size(a))  |>    # convert to original size
    x -> filter(iseven, x)          # keep only the even elements

What do you think about this? (It's probably a stupid solution. It seems to work, but I don't know if it's bad practice.)

|>₁(a, f_args::Tuple) = f_args[1](a, f_args[2:end]...)
|>₂(a, f_args::Tuple) = f_args[1](f_args[2], a, f_args[3:end]...)
|>₃(a, f_args::Tuple) = f_args[1](f_args[2], f_args[3], a, f_args[4:end]...)
a = [1:100;]
a |>₁
    (reshape, 10, 10)   |>₁
    (my_function, args) |>₁
    (reshape, size(a))  |>₂
    (filter, iseven)
5 Likes
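
Another Base-only option, offered as a sketch rather than a recommendation, is partial application with `Base.Fix1` and `Base.Fix2`, which fix the first or second argument of a two-argument function. Here `my_function` is a hypothetical stand-in (it just scales the matrix), and the reshape dimensions are passed as a tuple so that each call takes exactly two arguments:

```julia
# Hypothetical stand-in for the thread's my_function(x, args)
my_function(x, k) = x .* k

a = [1:100;]
result = a |>
    Base.Fix2(reshape, (10, 10))  |>   # reshape(x, (10, 10))
    Base.Fix2(my_function, 3)     |>   # my_function(x, 3)
    Base.Fix2(reshape, size(a))   |>   # back to the original shape
    Base.Fix1(filter, iseven)          # filter(iseven, x)

@assert result == collect(6:6:300)
```

This avoids both lambdas and custom operators, but it only covers functions called with exactly two arguments.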

Use Chain.jl. It solves a lot of these pain points.

3 Likes

Thank you, but I would prefer not to use macros from external libraries

Yeah, perhaps what this thread is still missing is a list of all the packages that improve upon the base pipe operator. To name a few:

11 Likes

I am curious about the reason for this.

I've sometimes tried to use the |> from Base, but then I run into cases where it doesn't work, find I don't understand why, and give up.

The examples in the Chain.jl README probably explain why, though I'm not 100% sure.

I've not used the ones in packages because I haven't wanted to spend time learning the differences and choosing one. What criteria would I use for choosing anyway? For me it would serve best if there were a "community recommendation" to use one and not the others. (This is also true more generally.)

There's also the minor point that typing | on a Finnish/Swedish keyboard requires option-7. The keyboard key itself has 7 and / on it.

There is an issue

https://github.com/JuliaLang/julia/issues/5571

and a proposal

https://github.com/JuliaLang/julia/pull/24990

1 Like

I'm a student and I don't like people reading my code to have to know about libraries that aren't strictly necessary.

4 Likes