Fixing the Piping/Chaining Issue (Rev 3)

uniment · November 27, 2022, 11:52am

Hm, interesting idea. My approach here has been to use local pronouns and rely on position instead of using proper names, but this is basically giving things a local nickname.

I think my main reservation is that nicknames only work for fluent interfaces, where the chained object maintains the same type throughout the call chain. For example, when I buy an orange, and I bring it home, and I peel it, and I split it, it’s still an orange (more or less). That’s a fluent call chain.

But not all call chains are fluent. When I chew the orange into a pulp, and I swallow it, and it dissolves in my stomach acid, and it enters my bloodstream, and the mitochondria (the powerhouse of the cell!) within my muscles use its energy to convert adenosine diphosphate into adenosine triphosphate, continuing to call it an orange becomes a bit … strained.
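To make the distinction concrete, here is a minimal plain-Julia sketch (no chaining macro involved). The helper `trace_types` is a hypothetical name I made up for illustration; it just records the type of the value after every link, so you can see whether a single nickname would stay honest.

```julia
# Hypothetical helper: thread x through each link and record the type at every step.
function trace_types(x, links...)
    types = Any[typeof(x)]
    for link in links
        x = link(x)
        push!(types, typeof(x))
    end
    return x, types
end

# Fluent: every link maps Float64 -> Float64, so one name fits the whole chain.
fluent_value, fluent_types = trace_types(9.0, sqrt, v -> v + 1, abs)

# Not fluent: String -> Int -> Float64 -> String; a fixed nickname gets strained.
mixed_value, mixed_types = trace_types("42", s -> parse(Int, s), sqrt, string)
```

In the first chain a nickname like `orange` would be accurate at every step; in the second it would be wrong after the first link, which is where a positional pronoun is more comfortable.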

But I should think about it more. Names are useful sometimes.

Nope, it just becomes it=it (at least it did, until I special-cased it so that it just deletes the expression)! For lines where a chain doesn’t do anything though, _ was less visually noisy so I lumped it into the special case. It’s like when the professor writes a long expression on the chalkboard, and on the next line writes two tick-marks to indicate “same as above.”

I’m not settled on using _ as a stand-in for “do nothing here,” because I don’t know what Julia will eventually decide to do with _, but it has the right feel to it. In the concept where _ denotes partial application, which is my hope, my intuition for a single _ on its own feels like it is requesting the identity function, which is perfect here.
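As a rough illustration of the "a lone _ means identity" intuition, here is a sketch in plain Julia. `runchain` is a hypothetical helper, not part of my prototype, and `identity` stands in for the `_` placeholder.

```julia
# Hypothetical helper: apply each link in order to the running value.
runchain(x, links...) = foldl((acc, link) -> link(acc), links; init = x)

# A "do nothing here" link is just the identity function.
with_placeholder    = runchain(3, it -> it + 1, identity, abs2)
without_placeholder = runchain(3, it -> it + 1, abs2)
```

Both calls give the same result, which is the behavior I want from a placeholder row or cell.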

The character I really wanted was ⋮ (\vdots), but it’s a binary operator in the language so the parser doesn’t allow me to leave it in a column by itself. Drats!

I already experienced this when making my toy demo code; managing alignment is a hassle. I think the issue, though, is that IDEs haven’t been built to help arrange 2D code, because such code doesn’t really exist yet. (Why should I invest time and effort making a tool to help people do something they can’t do?) It’s a catch-22.

But sometimes it’s just easier to reason about a problem graphically, so you might be willing to tough it out even without the aid of tooling. I experienced this too when making my toy demo code.

Eventually the IDE tooling will catch up, if it’s accepted as a feature of the language.

Yup! This works:

julia> {it+1, abs2}.((0,1,2,3))
(1, 4, 9, 16)

Or this:

julia> (0,1,2,3).{{it+1, abs2}.(it)}
(1, 4, 9, 16)

And after some tweaking, this works too now:

julia> (0,1,2,3) .|> {it+1, abs2}
(1, 4, 9, 16)

I don’t have any special syntax for making a chain that broadcasts though. I have mixed feelings about that.
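For comparison, here is what those broadcasts compute in plain Julia without the chain syntax. This is only a sketch of the semantics; `inc_then_square` is a name I chose for illustration, built with the standard composition operator.

```julia
# Plain-Julia stand-in for the chain {it+1, abs2}: add one, then square.
inc_then_square = abs2 ∘ (it -> it + 1)

dotcall_result = inc_then_square.((0, 1, 2, 3))    # broadcast call
dotpipe_result = (0, 1, 2, 3) .|> inc_then_square  # broadcast pipe
```

Both forms broadcast over the tuple and return a tuple, matching the outputs shown above.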

Indeed, it’s unfortunate that my choice of the pronouns it and them is a little bit verbose compared with using special characters. I think this is similar to Julia’s choice to use English words begin and end for block delimiting, unlike the C-style languages which use curly braces (and for which Julia receives endless criticism, along with 1-based indices—another critique which I think is overblown).

I think this becomes a matter of preference, e.g.:

My goal is to choose keywords or characters with the correct semantic meaning, ideally as short as possible, with a bias toward using the style of the language. I don’t want to claim _, because (due to hard local scoping) I have the liberty of using any valid identifier name, so it would be a waste to claim it. And I’m not certain if the dashed box has the correct semantic meaning for this. So for now, my bias is to use it and them, as the choice seems well-aligned with Julia’s style to use common English words.

I’m open to ideas; I’m looking for a compelling argument though. I’ve also thought about me and us (wow this got romantic all of a sudden), but that seems a little … what’s the word … cheesy?

This syntax already operates as:

julia> (+).{y->x->y(x,2)}(1)
3
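Spelled out in plain Julia, that line is just applying a curried lambda to `+` and then calling the result. A minimal sketch (the name `add2` is mine), with `Base.Fix2` shown as the existing standard-library way to fix the second argument:

```julia
# Apply the lambda to +, producing a one-argument function x -> x + 2.
add2 = (y -> x -> y(x, 2))(+)
lambda_result = add2(1)

# Base.Fix2 fixes the second argument of a two-argument function.
fix2_result = Base.Fix2(+, 2)(1)
```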

Also, I haven’t encountered too many situations when a chain of function calls takes exactly the same set of arguments except for the one which is being threaded through, so it seems wasteful to devote dedicated syntax to it. In this context, it seems more reasonable to either redefine the functions to work better together (it sounds like they come from the same library, so they should be rewritten to compose better), or overload them to take a tuple of the object and the range, or use a custom macro for this specific scenario. For example, using overloading to help them compose better:

f1(obj,rng) = obj.+rng; f2(obj,rng) = obj.-rng; f3(obj,rng) = obj.*rng;
for f ∈ (:f1, :f2, :f3) eval(:( $f(t::Tuple{Any, AbstractRange}) = ($f(t[1], t[2]), t[2]) )) end

x = 10rand(10); rng = 1:10;
x.{f1(it,rng), f2(it,rng), f3(it,rng)} == (x,rng).{f1, f2, f3}[1] # true

When I introduce the keyword it, it’s indeed to serve in glue logic to help functions compose that weren’t originally meant to compose. That’s real life—not everything is written to be composable. But if these functions all come from the same package, the package authors can do better imo.
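The same overloading trick can be checked in plain Julia with the built-in `|>`, without my chain syntax. This is a sketch of the idea only: the tuple methods return `(result, rng)` so the range rides along with the object from one call to the next.

```julia
f1(obj, rng) = obj .+ rng
f2(obj, rng) = obj .- rng
f3(obj, rng) = obj .* rng

# Add a method taking (object, range) and returning (new object, same range).
for f in (:f1, :f2, :f3)
    @eval $f(t::Tuple{Any, AbstractRange}) = ($f(t[1], t[2]), t[2])
end

x = 10 .* rand(10)
rng = 1:10

explicit = f3(f2(f1(x, rng), rng), rng)     # rng repeated at every call
threaded = ((x, rng) |> f1 |> f2 |> f3)[1]  # rng threaded through the tuple
```

The trailing `[1]` discards the range at the end, the same way it does in the chained version above.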

This works:

julia> im.{sin,cos}
1.7737756783403529 - 0.0im

julia> im |> {sin,cos}
1.7737756783403529 - 0.0im

However, using |> here a) reduces performance, because it creates a lambda which is compiled and then immediately discarded after use, b) uses awkward-to-type characters, and c) has lower precedence, so that if you want to access a property or index of the returned object, or otherwise manipulate it, you have to enclose the entire expression in parentheses. For example:

julia> @time im.{sin,cos}
  0.000003 seconds
1.7737756783403529 - 0.0im

julia> @time im |> {sin,cos}
  0.022731 seconds (2.92 k allocations: 145.515 KiB, 99.51% compilation time)
1.7737756783403529 - 0.0im

julia> im.{sin,cos}+1
2.773775678340353 - 0.0im

julia> im |> {sin,cos}+1
ERROR: MethodError: no method matching +(::var"#ChainLink#4", ::Int64)

I generally don’t see |> as a good chaining operator; the only thing it really shines at is pairing with println, imo. I really don’t see it as an operator worth defending.
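The precedence point can be reproduced in stock Julia, with no chain syntax at all. In this sketch `sincos_chain` is my own stand-in for {sin,cos}; `Meta.parse` shows how the parser groups the expression.

```julia
# Stand-in for the chain {sin,cos}: apply sin, then cos.
sincos_chain = cos ∘ sin

piped = im |> sincos_chain

# |> binds more loosely than +, so `x |> f + 1` groups as `x |> (f + 1)`.
parsed = Meta.parse("x |> f + 1")

# Without parentheses, Julia tries to add 1 to the function itself.
unparenthesized_error = try
    im |> sincos_chain + 1
    nothing
catch err
    err
end

# With parentheses around the piped expression, it works as intended.
parenthesized = (im |> sincos_chain) + 1
```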

This is the sort of question I’m hoping for. Are there any blindspots to my proposal, or is it sufficiently expressive to do all the things we might like to do (with sufficient adjustments to the rows and columns)?

One thing I feel could be worthwhile is being able to distribute sub-chains across processing threads or computers. If you wish to access arbitrary rows and columns at any point in any chain, not only do you need to communicate a lot of information in the code to express this, but it also becomes very difficult to keep track visually of where tasks start and end. The current arrangement, in which all chains start together and stop together, makes this easy to reason about.
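To show why "start together, stop together" is easy to reason about, here is a hedged sketch in plain Julia of what distributing three sub-chains over threads could look like. This is not what my prototype currently generates; the closures `f`, `g`, `h` are stand-ins for chain links, and the only synchronization point is the final `fetch`.

```julia
# Stand-in links: three independent columns of (add, multiply, power).
f = [it -> it + x for x in 1:3]
g = [it -> it * x for x in 3:-1:1]
h = [it -> it ^ x for x in 1:3]

input = (3, 2, 1)

# Each column is one task; all tasks start here...
tasks = [Threads.@spawn h[i](g[i](f[i](input[i]))) for i in 1:3]

# ...and all of them are collected here, in one place.
results = Tuple(fetch.(tasks))
```

Every task has exactly one start and one end, and both are visible on adjacent lines. Allowing arbitrary cross-references between columns would add more wait points than this.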

One concern I have is that adjusting the rows and columns can become a hassle very quickly, especially if you make a change that requires you to adjust all of them. I’m hoping that IDE tooling will be able to help here, but I will need more examples to see just how bad it can be.

Can you try to come up with things that you think this syntax will handle poorly, and I’ll try to see if I can make it work? For the example in your question, I will create some dummy code:

Suppose we have these functions:

@mc begin
f = [{it+x} for x ∈ 1:3];
g = [{it*x} for x ∈ 3:-1:1];
h = [{it^x} for x ∈ 1:3];
end

And we create a chain:

julia> (3,2,1).{
           it...
           f[1]    f[2]    f[3]
           g[1]    g[2]    g[3]
           h[1]    h[2]    h[3]
           them
       }
(12, 64, 64)
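For anyone following along without the macro, this is roughly what the grid computes, written as a plain-Julia sketch. Ordinary closures stand in for the {it+x} style links, and `grid_result` is a name I picked for illustration.

```julia
# Closures standing in for the chain links defined with @mc above.
f = [it -> it + x for x in 1:3]
g = [it -> it * x for x in 3:-1:1]
h = [it -> it ^ x for x in 1:3]

input = (3, 2, 1)

# Column i applies f[i], then g[i], then h[i] to the i-th element.
grid_result = ntuple(i -> h[i](g[i](f[i](input[i]))), 3)
```

Working through the first column by hand: 3 + 1 = 4, then 4 * 3 = 12, then 12 ^ 1 = 12.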

@macroexpand reveals the real structure of what’s being executed. It’s not perfect (it has a couple of unnecessary assignments which get compiled away), but it does the trick:

julia> @macroexpand (1,2,3).{
           it...
           f[1]    f[2]    f[3]
           g[1]    g[2]    g[3]
           h[1]    h[2]    h[3]
           them
       }
:(let it = (1, 2, 3), them = (it,)
      them = (it...,)
      them = (let it = them[1]
                  it = (f[1])(it)
                  it = (g[1])(it)
                  it = (h[1])(it)
              end, let it = them[2]
                  it = (f[2])(it)
                  it = (g[2])(it)
                  it = (h[2])(it)
              end, let it = them[3]
                  it = (f[3])(it)
                  it = (g[3])(it)
                  it = (h[3])(it)
              end)
      it = them[1]
      it = them
      it
  end)

Now, suppose we wish to access the result of the top-left call f[1], and include it in the output tuple. We can do this with:

julia> (3,2,1).{
           it...
           f[1]    f[2]    f[3]
           g[1]    g[2]    g[3]    them[1]
           h[1]    h[2]    h[3]    _
           them
       }
(12, 64, 64, 4)

If we then wish to add it to the result of calling h[2], we can do this:

julia> (3,2,1).{
           it...
           f[1]    f[2]        f[3]
           g[1]    g[2]        g[3]    them[1]
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them
       }
(12, 68, 64, 4)

I change my mind; we can discard the final value:

julia> (3,2,1).{
           it...
           f[1]    f[2]        f[3]
           g[1]    g[2]        g[3]    them[1]
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them[1:3]
       }
(12, 68, 64)
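The row-by-row behavior of that last grid can be mimicked in plain Julia to check the arithmetic. This is only a sketch of the semantics as I understand them from the expansion, with closures standing in for the links and `saved` as a hypothetical name for the value captured by them[1].

```julia
f = [it -> it + x for x in 1:3]
g = [it -> it * x for x in 3:-1:1]
h = [it -> it ^ x for x in 1:3]

them = (3, 2, 1)

# Row f: all three chains run, then their values are collected.
them = ntuple(i -> f[i](them[i]), 3)

# A fourth column starts here by capturing the first chain's value.
saved = them[1]

# Rows g and h run on the three original chains; the fourth just carries its value.
them = (ntuple(i -> h[i](g[i](them[i])), 3)..., saved)

# Only the second chain does work in the next row: it adds the fourth value.
them = (them[1], them[2] + them[4], them[3], them[4])

# Final row: keep the first three values and discard the fourth.
result = them[1:3]
```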

Now, the fact that accessing them[1] caused all the chains to stop, their values to be collected, and the chains to then restart might be inconvenient if we’re multi-threading (or if we had other local variables in the chains that we didn’t want discarded).

Calling @macroexpand shows how all three chains have been interrupted.

julia> @macroexpand (3,2,1).{
           it...
           f[1]    f[2]        f[3]
           g[1]    g[2]        g[3]    them[1]
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them[1:3]
       }
:(let it = (3, 2, 1), them = (it,)
      them = (it...,)
      them = (let it = them[1]
                  it = (f[1])(it)
              end, let it = them[2]
                  it = (f[2])(it)
              end, let it = them[3]
                  it = (f[3])(it)
              end)
      it = them[1]
      them = (let it = them[1]
                  it = (g[1])(it)
                  it = (h[1])(it)
              end, let it = them[2]
                  it = (g[2])(it)
                  it = (h[2])(it)
              end, let it = them[3]
                  it = (g[3])(it)
                  it = (h[3])(it)
              end, let it = them[3]
                  it = them[1]
              end)
      it = them[1]
      them = (let it = them[1]
                  it
              end, let it = them[2]
                  it = it + them[4]
              end, let it = them[3]
                  it
              end, let it = them[4]
                  it
              end)
      it = them[1]
      it = them[1:3]
      it
  end)

We would like only the left-most chain to stop, so we put f[1] in a row by itself:

julia> (3,2,1).{
           it...
           f[1]    _           _
           _       f[2]        f[3]    them[1]
           g[1]    g[2]        g[3]    _
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them[1:3]
       }
(12, 68, 64)

Running @macroexpand on this gives:

julia> @macroexpand (3,2,1).{
           it...
           f[1]    _           _
           _       f[2]        f[3]    them[1]
           g[1]    g[2]        g[3]    _
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them[1:3]
       }
:(let it = (3, 2, 1), them = (it,)
      them = (it...,)
      them = (let it = them[1]
                  it = (f[1])(it)
              end, let it = them[2]
                  it
              end, let it = them[3]
                  it
              end)
      it = them[1]
      them = (let it = them[1]
                  it = (g[1])(it)
                  it = (h[1])(it)
              end, let it = them[2]
                  it = (f[2])(it)
                  it = (g[2])(it)
                  it = (h[2])(it)
              end, let it = them[3]
                  it = (f[3])(it)
                  it = (g[3])(it)
                  it = (h[3])(it)
              end, let it = them[3]
                  it = them[1]
              end)
      it = them[1]
      them = (let it = them[1]
                  it
              end, let it = them[2]
                  it = it + them[4]
              end, let it = them[3]
                  it
              end, let it = them[4]
                  it
              end)
      it = them[1]
      it = them[1:3]
      it
  end)

Overall, I felt like this wasn’t a hassle at all. That said, running @macroexpand will probably be useful for inspecting the generated code once performance starts to matter.

I want to keep playing with examples to see if there’s anything it handles really poorly. So far I’ve been too happy with it.
