Fixing the Piping/Chaining Issue (Rev 3)

Third time’s (hopefully) the charm :wink:

Okay, so in contrast with my last two proposals (1, 2), this one is less preachy and more “hey, check out this crazy cool thing I did, please offer feedback.” Hopefully it’s good enough to be incorporated into the language. But play with it and let me know!

I got really creative with method chaining over the past few days, and I’m really excited to share it.

Underscore Partial Application is Settled (imo)

I think the problem of partial application is essentially already solved; it's just a matter of implementation.

Whether the Julia community has the will to accept it is a different matter, but that’s up to y’all! It boils down to these three steps:

  1. Accept that underscores will be used for partial application, and not for arbitrary lambdas.
  2. Make a generalized functor for partial applicators, which underscore syntax lowers to.
  3. Implement function composition as a default fallback when calling functions on these functors (a rough sketch follows below).
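
To make steps 2 and 3 a bit more concrete, here is a minimal, purely illustrative sketch in today’s Julia (the names Slot, PartialApplicator, and chain are mine, and the composition fallback is approximated with an explicit helper rather than a language-level rule):

struct Slot end                       # marks the open argument (the underscore)

struct PartialApplicator{F,A}
    f::F
    args::A                           # a mix of fixed values and Slot()s
end

# Applying the functor fills the slot(s) with the supplied value.
(p::PartialApplicator)(x) = p.f((a isa Slot ? x : a for a in p.args)...)

# Step 3, approximated: "calling" another function on the functor composes instead.
chain(g, p::PartialApplicator) = PartialApplicator(g ∘ p, (Slot(),))

add2 = PartialApplicator(+, (Slot(), 2))    # roughly what `_ + 2` could lower to
add2(3)                                     # 5
chain(sqrt, add2)(2)                        # sqrt(2 + 2) == 2.0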

In the last post I gave decent coverage of some ideas for how this could be tackled, and I played with a bunch of examples showing how function composition addresses the desire for multiple function calls.

So now I will focus on chaining as a separate matter, completely distinct from underscore syntax. We can get some really cool results if we focus specifically on chaining.

Single Chains

Thanks to @christophE in this post, I realized that dot . expressions with curly braces like x.{f, g, h} parse! And they already have all the behaviors I want, regarding a) precedence, b) being easy *enough* to type conveniently, and c) having convenient, unclaimed syntax for specifying un-executed chainlinks. And did I mention they already parse? It’s basically perfect.

Not only that, but because these expressions re-use the same parsing machinery as matrices, even 2-dimensional collections of expressions can be specified. What could that possibly be good for?? :thinking: I thought about it, and I went to town on it. Let’s see if you agree with me.

I put my implementation in a GitHub repo MethodChains.jl so you can play along.

The Standard Demos

I’m re-using the idea of having the local keyword it refer to the last expression’s result, and of it being the default argument for function calls. (See the explanation for this decision here.) Each expression is assumed to be an expression of it, or to evaluate to a callable object which will be called on it.

Of course, it works just fine for the examples we’ve encountered so far.

Basic Method Chains

x.{f, g, h}    ==  
x.{f; g; h}    ==
x.{
    f
    g
    h
} ==
x.{f}.{g}.{h}  ==  
h(g(f(x)))

Because the contents of {} already parse like the contents of [], the same behaviors regarding commas, newlines, and semicolons carry over.

Chains with Expressions of it

x.{f, it+it^2, g} == g(let y=f(x); y+y^2 end)

The Most Common Use for Chains Is Short Expressions

seq.{length}      ==  length(seq)
obj.{first}.a     ==  first(obj).a
map({it^2}, obj)  ==  map(x->x^2, obj)
x.{cond ? f : g}  ==  (cond ? f : g)(x)

Examples from Pipe.jl Readme

a.{b(it...)}
a.{b(it(1,2))}
a.{b(it[3])}
(2,4).{get_angle(it...)}

Longer Chains:

Example from Chain.jl Readme

df.{
    dropmissing
    filter(:id => >(6), it)
    groupby(it, :group)
    combine(it, :age => sum)
}
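
For comparison, here is the nested call this chain replaces (assuming using DataFrames and a df with :id, :group, and :age columns):

combine(groupby(filter(:id => >(6), dropmissing(df)), :group), :age => sum)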

Example from DataPipes.jl Readme

"a=1 b=2 c=3".{
    split,
    map({
        split(it, "=")
        (Symbol(it[1]) => parse(Int, it[2]))
    }, it),
    NamedTuple,
}
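
And for comparison, my hand-unrolled plain-Julia version of the same pipeline:

NamedTuple(map(split("a=1 b=2 c=3")) do s
    kv = split(s, "=")                  # e.g. "a=1" -> ["a", "1"]
    Symbol(kv[1]) => parse(Int, kv[2])
end)                                    # (a = 1, b = 2, c = 3)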

Also, if expressions in the chain are assignments, then they stay as-is and are not turned into assignments to it. This allows for declaring other variables local to the chain.

const avg = {len=it.{length}; sum(it)/len}
const stdev = {μ = it.{avg}, it.-μ, it.^2, avg, sqrt}

(1,2,3).{stdev} == 0.816496580927726

Note that chains and chainlinks are hard-scoped, so although they can access variables in their parent scope, they cannot change them.
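
For reference, here is roughly what the stdev chain above corresponds to when written out by hand in plain Julia (the name stdev_plain is mine):

stdev_plain(x) = let it = x
    μ  = sum(it) / length(it)    # it.{avg}
    it = it .- μ
    it = it .^ 2
    it = sum(it) / length(it)    # avg
    sqrt(it)
end

stdev_plain((1, 2, 3))    # 0.816496580927726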

Expressions which are not callable (e.g., :tuple, :generator, etc.) are simply assigned to it without attempting to call them or confirm that they’re expressions of it. This allows for things like this:

x = (1, 2)
x.{(a,b)=it, (; b,a)} == (b=2, a=1)
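
In plain Julia, that lowering reads roughly as (per the assignment and non-callable rules just described):

let it = x
    (a, b) = it       # assignments are kept as-is
    it = (; b, a)     # a NamedTuple expression isn't callable, so it's simply assigned to it
end                   # returns (b = 2, a = 1)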

There are lots more demos in the Readme.

Multi-Chains

This is where we enter the twilight zone. Early versions of Julia took inspiration from MATLAB and used curly braces {} to define arrays, but that syntax was deprecated (though the parser still accepts it). As a result, we now have an unclaimed syntax for expressions spread across two dimensions.

What does it mean to have callable objects (and expressions of it) spread across two dimensions?

My thought is, they can represent multiple chains of execution, where progress down the vertical dimension represents the passage of time, and the horizontal dimension separates one chain of execution from another. The chains can intermittently interact by: a) having their values copied across new chains, b) having their values splatted across new chains, c) having their values discarded, or d) having their values slurped into a collection. For (d), I reserve a new local keyword them which collects all the chains’ it values into a tuple.

Here’s an example showing splatting and slurping:

(a, b, c).{
    it...
    f       g       h
    g       h       f
    h       f       g
    it+3    it*2    it+1    
    them
}

The result is the tuple (h(g(f(a)))+3, f(h(g(b)))*2, g(f(h(c)))+1). First (a, b, c) is splatted out across the columns, then each column performs its own operations, and the results are collected into a tuple them which is then returned.
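
Written out by hand in plain Julia (with f, g, and h as placeholders for whatever callables you supply; the name multi_demo is mine), that multi-chain amounts to roughly:

function multi_demo(a, b, c, f, g, h)
    them = (a, b, c)                      # `it...` splats across three columns
    them = (h(g(f(them[1]))) + 3,         # column 1
            f(h(g(them[2]))) * 2,         # column 2
            g(f(h(them[3]))) + 1)         # column 3
    them                                  # `them` slurps the results back into a tuple
end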

I’ve implemented these rules:

  1. A background chain, which maintains its local it and them values, is always present throughout the life of a multi-chain. Any local variables in the background chain are available to sub-chains (except for it, because each subchain declares its own local it which shadows the background chain’s).

  2. Any time that more than one column is present, every column gets its own local subchain. The local subchain maintains its own it value, and can also have other local variables.
    – Note that sub-chains have hard local scopes, so they can’t alter the background chain’s local variables and they can’t alter each other’s.

  3. If the next row has a different number of columns than the last row, or if the last row has a splat ... to spread an iterable object across the columns, or if the next row has a them to collect all the previous columns’ it values, then all chains are halted (and their local variables discarded), an inventory is taken of all of their it values (which are collected into a tuple them), and execution resumes with whatever new chains there are.
    – Fun note: the values can be collected and re-splatted with them...

  4. When all sub-chains halt (which they must, in order for a single object to be returned), the background chain is resumed. By default its it value takes the left-most subchain’s it value, but them can be used to slurp up all the other subchains’ values into a tuple and return that instead.

Here are a couple examples:

Compute Standard Deviation, Variance, and Maximum Absolute Deviation

julia> @btime (0:10...,).{
           avg = {len=it.{length}, sum(it)/len}
           μ = it.{avg}
           it .- μ

         # stdev     var      mad
           it.^2     it.^2    abs.(it)
           avg       avg      maximum
           sqrt      _        _
           them
       }
  1.000 ns (0 allocations: 0 bytes)
(3.1622776601683795, 10.0, 5.0)

The first two lines declared local variables for the background chain, then the third line set the background chain’s it value to something new. Then, when three columns appeared, the background chain’s it value was copied across, and each chain carried out its own instructions on it. For chains with no more commands, _ is used to denote that the chain should do nothing, but not be halted or have its value discarded. On the last line, them collects all the chains’ values into a single tuple.
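
For comparison, a hand-written plain-Julia version of the same computation (the name stats_tuple is mine):

function stats_tuple(x)
    avg(v) = sum(v) / length(v)
    μ = avg(x)
    d = x .- μ
    (sqrt(avg(d .^ 2)),    # stdev
     avg(d .^ 2),          # var
     maximum(abs.(d)))     # mad
end

stats_tuple((0:10...,))    # (3.1622776601683795, 10.0, 5.0)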

FFT

Can we implement functions which are graphically arranged to more closely match the underlying algorithm?

const toy_fft = {
    # setup
    Vector{ComplexF64}
    n = it.{length}
    if n == 2 return [it[1]+it[2]; it[1]-it[2]] else it end # base case
    W = exp(-2π*im/n)

    # butterfly
    it[1:2:end-1].{toy_fft}   it[2:2:end].{toy_fft}
    _                         it.*W.^(0:n÷2-1)
#   ⋮        ⋱                ⋰         ⋮
                (x1,x2)=them
#   ⋮        ⋰                ⋱         ⋮
    [x1.+x2          ;            x1.-x2]::Vector{ComplexF64}
}

x = rand(1024)
X = x.{toy_fft}

Here, toy_fft is a recursive function that receives a single argument. The first line sets the collection type to a vector of complex numbers; the second line sets a local variable; the third line sets a base case return value; and the fourth line sets another local variable. From there, the background chain’s it value (which has so far been unchanged) is copied across two chains. Each chain takes its own slice of it (evens or odds), makes a recursive call on it, and the right chain applies phase-shifting twiddle factors. The line (x1,x2)=them collects the results and gives them convenient names, and then a butterfly is implemented on the final line and the results are concatenated for return.
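
For reference, here is the same textbook radix-2 decimation-in-time algorithm written as an ordinary recursive function; this is just my plain-Julia restatement for comparison, not part of the package:

function toy_fft_plain(x::AbstractVector)
    v = Vector{ComplexF64}(x)
    n = length(v)
    n == 2 && return [v[1] + v[2]; v[1] - v[2]]           # base case
    W  = exp(-2π * im / n)                                # twiddle base
    x1 = toy_fft_plain(v[1:2:end-1])                      # the v[1], v[3], … sub-sequence
    x2 = toy_fft_plain(v[2:2:end]) .* W .^ (0:n÷2-1)      # the v[2], v[4], … sub-sequence, twiddled
    [x1 .+ x2; x1 .- x2]                                  # butterfly
end

toy_fft_plain([1, 2, 3, 4])    # ≈ [10, -2+2im, -2, -2-2im], matching the REPL session below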

Naturally, I tried calling this function on a tuple but it doesn’t work:

julia> (1, 2, 3, 4).{toy_fft}
ERROR: MethodError: no method matching Vector{ComplexF64}(::NTuple{4, Int64})

The keyword it sure comes in handy!

julia> (1, 2, 3, 4).{[it...], toy_fft}
4-element Vector{ComplexF64}:
                10.0 + 0.0im
                -2.0 + 2.0im
                -2.0 + 0.0im
 -1.9999999999999998 - 2.0im

Summary

I’ve decided to make a go of using dot . syntax and curly braces {} for a method chaining syntax, and the result has been quite amazing and a lot of fun. The expressive flexibility of curly brace syntax enables chaining to generalize into entire function definitions with parallel tasks, while still making the simplest common things like accessing obj.{first}.a work a treat.

Please play with MethodChains.jl and let me know your thoughts! I feel like this is a pretty good candidate for what curly brace syntax should be used for in the language, and it could help settle people’s desire to relegate it to block delimiting.

I think the aspect I’m least settled on is how to manage the merging and splitting of chains. I haven’t worked it out, but I think you can implement arbitrary directed acyclic graphs of signal flows. However, any time a chain starts or halts, all chains halt and an inventory is taken before starting up again. This makes it straightforward to reason about (especially with regard to how to handle multiple expressions accessing them, or multiple splats in the same row), but there might be some use case I’m overlooking for which this is too constraining. Also, the default multi-chain alignment is left: if a new chain begins (and the previous row did not splat), the right-most value is copied over into it; if a chain stops, the right-most chain’s value is dropped; and in the end, when all the sub-chains stop, the left-most chain’s value is returned (unless they’re first collected with them).

The current behavior can be exemplified by the following implementation of an n=16 length FFT:

@mc const fft2 = { 
    (x1,x2)=it
    (x1+x2, x1-x2)
}

@mc const fft4 = {
    it[1:2:3].{fft2}...   it[2:2:4].{fft2}...
    _       _       it*1        -it*im;
    (x1,x2)=(them[1:2], them[3:4])
    (x1.+x2..., x1.-x2...)
}

@mc const fft8 = {
    ∠ = √-im
    it[1:2:7].{fft4}...   it[2:2:8].{fft4}...
    _   _   _   _   it*1    it*∠  -it*im  it*∠^3
    (x1,x2)=(them[1:4], them[5:8])
    (x1.+x2..., x1.-x2...) 
}

@mc const fft16 = {
    ∠ = √-im; ∠∠ = √∠
    it[1:2:15].{fft8}...  it[2:2:16].{fft8}...
    _   _   _   _   _   _   _   _   it*1   it*∠∠   it*∠   it*∠∠^3   -it*im   it*∠∠^5   it*∠^3   it*∠∠^7
    (x1,x2)=(them[1:8], them[9:16])
    (x1.+x2..., x1.-x2...)
}

In the definition of fft16, the result of calling fft8 on an even or odd slice of it is splatted across the next row; the left eight are left unchanged, while the right eight are twiddled by phase factors spanning half the complex unit circle. Then, slices of them are collected and assigned to x1 and x2 for easy broadcasted arithmetic, before being re-splatted into a tuple.

All that to say, the current arrangement seems to have its merits and is easy enough to reason about; I feel like it’s pretty good. But I don’t know if it’s the best possible option or not. Your thoughts are very welcome.

GLHF!

33 Likes

This looks really awesome! Could be interesting to define your own it/them kind of like the Lazy.jl @as macro; I think syntax highlighting of “it”/“them” would be huge for readability. I need to take a little more time to play around and digest this but I’m excited! I like the idea of building around function composition.

1 Like

Thanks, interesting read.

2 Likes

Looks promising. It’s similar to uniform function call syntax (known from D and Nim) and at the same time provides the monadic semantics that I would have used for a functional programming language. It also provides variable names during chains. I like the use of it and them. I am sorry that I didn’t respond to your last message. I wanted to leave it to the people whether and how they use my ideas or not.

It looks better and less noisy than stacking lambda expressions together with pipes. The whitespace separation is something one would need to get used to, but it makes sense. I was already waiting for modern languages to be extended into 2D instead of 1D. The only big problem would be the (optional) alignment of the whitespace-separated cells. Aligning with spaces is bad for maintenance, and other options (such as alignment control characters, what a tab actually should have been) usually are not supported by text editors (as far as I know).

What is actually the reason for using _ instead of just it? Would it be interpreted as it(it)?

1 Like

Now this is getting really interesting … in the end, nested function calls - with or without named arguments - can be represented as a DAG. Thus, you can think of the functions as nodes/boxes which are connected into more or less complicated call graphs via edges/wires.
Almost all approaches I have seen to simplify working with complicated function chains run into the shortcomings of linear syntax, i.e., assuming a 1-dimensional textual notation. Here are some links on prior art, e.g., using hash maps (dictionaries) for composing functions via named arguments (Prismatic’s “Graph”), or using higher-order combinators (binary functions) for expressing call chains with non-trivial wiring (Arrows).
Using the last example from the Arrows link we would get the following:

f &&& g >>> arr (\ (y, z) -> y + z)   # Haskell arrows

{:a (fnk [x] (f x)), :b (fnk [x] (g x)), :c (fnk [a b] (+ a b))}  ;; Prismatic graph with input :x

x.{        # Your notation
  f         g
  +(them...)
}

Guess that thinking about call graphs as DAGs and coming up with a good 2-dimensional notation could be a real revelation. Your approach already covers much ground here; in particular, being able to distribute and recollect arguments is very useful … I wonder if more complicated rules like “get the value from 2 steps up and three chains to the left” or something would add much expressive power, or be required at all?

3 Likes

Super cool. Is it compatible with broadcast? I.e., if I do something like

m = {f, g, h}
mapped_arr = m.(arr)

The only thing I’m not crazy about is the choice of keyword it. I would much prefer the underscore or the dashed box thing.

Another idea is to merge it with threading / partial application by allowing extra arguments to be applied to everything inside {}. I.e., I had seen someone on Slack wishing there were a way to pass rng into every function in a chain, like this

@pipe g |> f1(_, rng) |> f2(_, rng) |> f3(_, rng)

Perhaps with the new syntax this could be

@mc g.{f1, f2, f3}(⬚, rng)
2 Likes

It’s a brilliant idea to visually interpret “piping” as a vector (or columns of a matrix) of functions. Many thanks for your efforts, and I think it’s a real step forward toward a final solution.
Just my two cents: I guess using |> instead of dot would be a little better for compatibility with Base, i.e.

 x |> {f, g}
4 Likes

Hm, interesting idea. My approach here has been to use local pronouns and rely on position instead of using proper names, but this is basically giving things a local nickname.

I think my main reservation is that nicknames only work for fluent interfaces, where the chained object maintains the same type throughout the call chain. For example, when I buy an orange, and I bring it home, and I peel it, and I split it, it’s still an orange (more or less). That’s a fluent call chain.

But not all call chains are fluent. When I chew the orange into a pulp, and I swallow it, and it dissolves in my stomach acid, and it enters my bloodstream, and the mitochondria (the powerhouse of the cell!) within my muscles use its energy to convert adenosine diphosphate into adenosine triphosphate, continuing to call it an orange becomes a bit … strained.

But I should think about it more. Names are useful sometimes.

Nope, it just becomes it=it (at least it did, until I special-cased it so that it just deletes the expression)! For lines where a chain doesn’t do anything though, _ was less visually noisy so I lumped it into the special case. It’s like when the professor writes a long expression on the chalkboard, and on the next line writes two tick-marks to indicate “same as above.”

I’m not settled on using _ as a stand-in for “do nothing here,” because I don’t know what Julia will eventually decide to do with _, but it has the right feel to it. In the concept where _ denotes partial application, which is my hope, my intuition for a single _ on its own feels like it is requesting the identity function, which is perfect here.

The character I really wanted was ⋮ (\vdots), but it’s a binary operator in the language so the parser doesn’t allow me to leave it in a column by itself. Drats!

I already experienced this when making my toy demo code; managing alignment is a hassle. I think the issue though, is that IDEs haven’t been made to offer help in arranging 2D code, because such code doesn’t really exist. (Why should I invest time and effort making a tool to help people do something they can’t do?) It’s a catch-22.

But sometimes it’s just easier to reason about a problem graphically, so you might be willing to tough it out even without the aid of tooling. I experienced this too when making my toy demo code.

Eventually the IDE tooling will catch up, if it’s accepted as a feature of the language.

Yup! This works:

julia> {it+1, abs2}.((0,1,2,3))
(1, 4, 9, 16)

Or this:

julia> (0,1,2,3).{{it+1, abs2}.(it)}
(1, 4, 9, 16)

And after some tweaking, this works too now:

julia> (0,1,2,3) .|> {it+1, abs2}
(1, 4, 9, 16)

I don’t have any special syntax for making a chain that broadcasts though. I have mixed feelings about that.

Indeed, it’s unfortunate that my choice of the pronouns it and them is a little bit verbose compared with using special characters. I think this is similar to Julia’s choice to use English words begin and end for block delimiting, unlike the C-style languages which use curly braces (and for which Julia receives endless criticism, along with 1-based indices—another critique which I think is overblown).

I think this becomes a matter of preference.

My goal is to choose keywords or characters with the correct semantic meaning, ideally as short as possible, with a bias toward using the style of the language. I don’t want to claim _, because (due to hard local scoping) I have the liberty of using any valid identifier name, so it would be a waste to claim it. And I’m not certain if the dashed box has the correct semantic meaning for this. So for now, my bias is to use it and them, as the choice seems well-aligned with Julia’s style to use common English words.

I’m open to ideas; I’m looking for a compelling argument though. I’ve also thought about me and us (wow this got romantic all of a sudden), but that seems a little … what’s the word … cheesy? :sweat_smile:

This syntax already operates as:

julia> (+).{y->x->y(x,2)}(1)
3

Also, I haven’t encountered too many situations when a chain of function calls takes exactly the same set of arguments except for the one which is being threaded through, so it seems wasteful to devote dedicated syntax to it. In this context, it seems more reasonable to either redefine the functions to work better together (it sounds like they come from the same library, so they should be rewritten to compose better), or overload them to take a tuple of the object and the range, or use a custom macro for this specific scenario. For example, using overloading to help them compose better:

# Three functions that each take the object plus a range:
f1(obj, rng) = obj .+ rng;  f2(obj, rng) = obj .- rng;  f3(obj, rng) = obj .* rng;

# Add a 1-argument method to each that threads the range through as part of a tuple:
for f ∈ (:f1, :f2, :f3)
    eval(:( $f(t::Tuple{Any, AbstractRange}) = ($f(t[1], t[2]), t[2]) ))
end

x = 10rand(10); rng = 1:10;
x.{f1(it,rng), f2(it,rng), f3(it,rng)} == (x,rng).{f1, f2, f3}[1] # true

When I introduce the keyword it, it’s indeed to serve in glue logic to help functions compose that weren’t originally meant to compose. That’s real life—not everything is written to be composable. But if these functions all come from the same package, the package authors can do better imo.

This works:

julia> im.{sin,cos}
1.7737756783403529 - 0.0im

julia> im |> {sin,cos}
1.7737756783403529 - 0.0im

However, using |> here a) causes reduction in performance due to creating a lambda which is compiled and immediately discarded after use, b) uses awkward-to-type characters, and c) has lower precedence, so that if you want to access a property or index of the returned object, or otherwise manipulate it, you have to enclose the entire expression in parentheses. For example:

julia> @time im.{sin,cos}
  0.000003 seconds
1.7737756783403529 - 0.0im

julia> @time im |> {sin,cos}
  0.022731 seconds (2.92 k allocations: 145.515 KiB, 99.51% compilation time)
1.7737756783403529 - 0.0im

julia> im.{sin,cos}+1
2.773775678340353 - 0.0im

julia> im |> {sin,cos}+1
ERROR: MethodError: no method matching +(::var"#ChainLink#4", ::Int64)

I generally don’t see |> as a good chaining operator; the only thing it really shines at is pairing with println, imo. I really don’t see it as an operator worth defending.

This is the sort of question I’m hoping for. Are there any blindspots to my proposal, or is it sufficiently expressive to do all the things we might like to do (with sufficient adjustments to the rows and columns)?

One of the things I feel like could be worthwhile, is to be able to distribute sub-chains across processing threads or computers. If you wish to access arbitrary rows and columns at any point in any chain, not only do you need to communicate a lot of information in the code to express this, but it becomes very difficult to keep track of where tasks start and end visually. The current arrangement, of having all chains start together and stop together, makes it easy to reason about this.

One concern I have is that adjusting the rows and columns can become a hassle very quickly, especially if you make a change that requires you to adjust all of them. I’m hoping that IDE tooling will be able to help here, but I will need more examples to see just how bad it can be.

Can you try to come up with things that you think this syntax will handle poorly, and I’ll try to see if I can make it work? For the example in your question, I will create some dummy code:

Suppose we have these functions:

@mc begin
f = [{it+x} for x ∈ 1:3];
g = [{it*x} for x ∈ 3:-1:1];
h = [{it^x} for x ∈ 1:3];
end

And we create a chain:

julia> (3,2,1).{
           it...
           f[1]    f[2]    f[3]
           g[1]    g[2]    g[3]
           h[1]    h[2]    h[3]
           them
       }
(12, 64, 64)
@macroexpand reveals the real structure of what's being executed

It’s not perfect (it has a couple unnecessary assignments which get compiled away), but it does the trick:

julia> @macroexpand (1,2,3).{
           it...
           f[1]    f[2]    f[3]
           g[1]    g[2]    g[3]
           h[1]    h[2]    h[3]
           them
       }
:(let it = (1, 2, 3), them = (it,)
      them = (it...,)
      them = (let it = them[1]
                  it = (f[1])(it)
                  it = (g[1])(it)
                  it = (h[1])(it)
              end, let it = them[2]
                  it = (f[2])(it)
                  it = (g[2])(it)
                  it = (h[2])(it)
              end, let it = them[3]
                  it = (f[3])(it)
                  it = (g[3])(it)
                  it = (h[3])(it)
              end)
      it = them[1]
      it = them
      it
  end)

Now, suppose we wish to access the result of the top-left call f[1], and include it in the output tuple. We can do this with:

julia> (3,2,1).{
           it...
           f[1]    f[2]    f[3]
           g[1]    g[2]    g[3]    them[1]
           h[1]    h[2]    h[3]    _
           them
       }
(12, 64, 64, 4)

If we then wish to add it to the result of calling h[2], we can do this:

julia> (3,2,1).{
           it...
           f[1]    f[2]        f[3]
           g[1]    g[2]        g[3]    them[1]
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them
       }
(12, 68, 64, 4)

Now say I change my mind: we can discard the final value:

julia> (3,2,1).{
           it...
           f[1]    f[2]        f[3]
           g[1]    g[2]        g[3]    them[1]
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them[1:3]
       }
(12, 68, 64)

Now, the fact that accessing them[1] caused all the chains to cease, their values to be collected, and then re-start, might be inconvenient if we’re multi-threading (or if we had other local variables in the chains we didn’t want discarded).

Calling @macroexpand shows how all three chains have been interrupted.
julia> @macroexpand (3,2,1).{
           it...
           f[1]    f[2]        f[3]
           g[1]    g[2]        g[3]    them[1]
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them[1:3]
       }
:(let it = (3, 2, 1), them = (it,)
          them = (it...,)
          them = (let it = them[1]
                      it = (f[1])(it)
                  end, let it = them[2]
                      it = (f[2])(it)
                  end, let it = them[3]
                      it = (f[3])(it)
                  end)
          it = them[1]
          them = (let it = them[1]
                      it = (g[1])(it)
                      it = (h[1])(it)
                  end, let it = them[2]
                      it = (g[2])(it)
                      it = (h[2])(it)
                  end, let it = them[3]
                      it = (g[3])(it)
                      it = (h[3])(it)
                  end, let it = them[3]
                      it = them[1]
                  end)
          it = them[1]
          them = (let it = them[1]
                      it
                  end, let it = them[2]
                      it = it + them[4]
                  end, let it = them[3]
                      it
                  end, let it = them[4]
                      it
                  end)
          it = them[1]
          it = them[1:3]
          it
      end)

We would like only the left-most chain to stop, so we put f[1] in a row by itself:

julia> (3,2,1).{
           it...
           f[1]    _           _
           _       f[2]        f[3]    them[1]
           g[1]    g[2]        g[3]    _
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them[1:3]
       }
(12, 68, 64)
This is what running `@macroexpand` on this looks like.
julia> @macroexpand (3,2,1).{
           it...
           f[1]    _           _
           _       f[2]        f[3]    them[1]
           g[1]    g[2]        g[3]    _
           h[1]    h[2]        h[3]    _
           _       it+them[4]  _       _
           them[1:3]
       }
:(let it = (3, 2, 1), them = (it,)
          them = (it...,)
          them = (let it = them[1]
                      it = (f[1])(it)
                  end, let it = them[2]
                      it
                  end, let it = them[3]
                      it
                  end)
          it = them[1]
          them = (let it = them[1]
                      it = (g[1])(it)
                      it = (h[1])(it)
                  end, let it = them[2]
                      it = (f[2])(it)
                      it = (g[2])(it)
                      it = (h[2])(it)
                  end, let it = them[3]
                      it = (f[3])(it)
                      it = (g[3])(it)
                      it = (h[3])(it)
                  end, let it = them[3]
                      it = them[1]
                  end)
          it = them[1]
          them = (let it = them[1]
                      it
                  end, let it = them[2]
                      it = it + them[4]
                  end, let it = them[3]
                      it
                  end, let it = them[4]
                      it
                  end)
          it = them[1]
          it = them[1:3]
          it
      end)

Overall, I felt like this wasn’t a hassle at all. That said, it seems like running @macroexpand will be useful for inspecting the code when you eventually want to make it performant.

I want to keep playing with examples to see if there’s anything it handles really poorly. So far I’ve been too happy with it.

2 Likes

By x |> {f, g} I mean a new syntax which means evaluating g(f(x)) (no need to create a lambda), and which doesn’t use the SINGLE_CHAIN_LINK in your implementation.

Again, I like the idea that braces {} mean piping. However, there are a few ways to implement it, like

x.{f,g}  # MethodChains.jl
x |> {f, g} # more compatible with Base
# or even just
x{f,g} # already parses too
# or some others ...
2 Likes

Ok, here is an example of a graph from the Prismatic page:

(def stats-graph
  {:n  (fnk [xs]   (count xs))
   :m  (fnk [xs n] (/ (sum identity xs) n))
   :m2 (fnk [xs n] (/ (sum #(* % %) xs) n))
   :v  (fnk [m m2] (- m2 (* m m)))})

It works in your notation, but does not feel very natural as most functions take two arguments:

julia> function stats_graph(xs)
       @mc xs.{
           it                    length(it)
           sum(them[1])/them[2]  sum(x->x^2, them[1])/them[2]
           them[2]-them[1]^2
       }
       end
stats_graph (generic function with 1 method)

julia> stats_graph([1, 2, 3, 6])
3.5

In the case of pure functions, once the dependencies between functions are known, parallelization and other optimizations can be done quite freely by re-ordering functions where possible. It seems that in your current implementation them acts like a synchronization point. In the case of side effects, the order of evaluation matters, and in particular such synchronization points can lead to visible differences:

julia> f1(x) = @show(x + 1)
f1 (generic function with 1 method)

julia> f2(x) = @show(x + 2)
f2 (generic function with 1 method)

julia> g(x) = @show(2 * x)
g (generic function with 1 method)

julia> @mc (2, 3).{
           it...
           f1     g
           f2     _
           them  # f1, f2 are evaluated before g
       }
x + 1 = 3
x + 2 = 5
2 * x = 6
(5, 6)

julia> @mc (2, 3).{
           it...
           f1     g
           them...  # f1 and g evaluated here
           f2     _
           them
       }

Note that the structuring of side-effects is the main point of Arrows in Haskell. Also for MethodChains it should probably be part of the semantics, i.e., @mc always evaluates left-to-right and top-to-bottom whereas @mct (method chain threaded) is free to parallelize/re-order between synchronization points.

So you’re requesting that I offer assistance to a language feature which I don’t even like (for the other two reasons I’ve mentioned)?

and then you suggest taking syntax which has already been claimed in Base Julia? (admittedly I like this syntax best, but that ship sailed a long time ago.)

Help me understand the nature of your protest. Is it simply that you dislike the visual appearance of x.{f,g}? Is it that it looks too similar to broadcasting? Or?

I think this is the more natural way to structure this:

@mc function stats_graphs(xs)
    xs.{
        len = it.{length}
        sum(it)/len         sum({it^2}, it)/len
        them[2]-them[1]^2
    }
end

So in this circumstance, acknowledging that both chains will immediately use this computed value encourages us to place its computation upfront, outside of the chains, which seems like a positive development.

Not a bad take. I will note that the objective is to find syntax worthy of incorporation into the base language (or, if that’s too ambitious a goal for multi-chains, at least I’d like to see how close we can get), so that you wouldn’t have to type @mc anymore.

My tentative plan had been to place a macro call inside the multi-chain for multi-threading or distributing, like:

function stats_graphs(xs)
    xs.{
        len = it.{length}
        Threads.@threads # multi-thread the columns that start on the next line
        sum(it)/len         sum({it^2}, it)/len
        them[2]-them[1]^2
    }
end

On the other hand, if I offer no guarantees of execution order, then it will discourage side effects and that could be a good thing if it gives the compiler more liberty to optimize. This view takes the left to right execution order as merely an artifact of how I chose to implement it.
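
For what it’s worth, the two columns of the stats_graphs example can already be run concurrently in plain Julia today, independent of any new syntax (the name stats_graph_threaded is mine; this just uses Base threading):

function stats_graph_threaded(xs)
    len = length(xs)
    t1 = Threads.@spawn sum(xs) / len            # left column
    t2 = Threads.@spawn sum(x -> x^2, xs) / len  # right column
    them = (fetch(t1), fetch(t2))                # the synchronization point
    them[2] - them[1]^2
end

stats_graph_threaded([1, 2, 3, 6])    # 3.5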

1 Like

Could one define a macro, say @m, in a package such that, for an x that is not a Type,

@m x{f,g}

would evaluate the Expr below?

:(let it = x
      it = f(it)
      it = g(it)
  end)
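
Something along these lines would probably do it (a rough sketch, untested, just to make the idea concrete):

macro m(ex)
    ex isa Expr && ex.head == :curly || error("expected x{f, g, ...}")
    x, fs = ex.args[1], ex.args[2:end]
    steps = [:(it = $(esc(f))(it)) for f in fs]   # thread `it` through each function
    quote
        let it = $(esc(x))
            $(steps...)
        end
    end
end

x = 2.0
@m x{sqrt, inv}    # inv(sqrt(2.0)) ≈ 0.707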

I was trying to explore alternatives; no protest here.

This is a really exciting approach, and the syntax obj.{first}.a is a really elegant solution, in my opinion, to providing “object-like” method invocation (similar to Java or Python) without implying that the methods are members of the object itself. Very excited to try this out!

6 Likes

Ah, gotcha! My primary constraint here is to find syntax which isn’t claimed, because my hope is to have a [good] chaining syntax implemented as a proper feature in the language, not constrained to macro calls. In my last two proposals I was adamant about this (and I gave my reasons)—I should have re-emphasized it; my mistake.

I’m only using macros here because that’s the only way I can implement it for trial and experimentation. But if it becomes part of the language, we won’t have to type @mc or call any macro at all.

Thanks! Let me know how it goes!

2 Likes

Update: pretty big overhaul of the multi-chain code for bug fixes, and fixed chain-local variable scope (added local statements to all assignments). Also added an experimental local keyword loop.

Let’s see if anybody can figure out what this does…

open("./input/1.txt") do io
    io.{
        read(it, String)
        split(it, "\n\n")
        map({
            split
            map({try parse(Int, it) catch; 0 end}, it)
            sum
        }, it)
        sort                       ;@show it
        last        last(it,3)     ;@show(top=them)...
        _           sum
        (top_one, sum_top_three)=them
        (; top_one, sum_top_three)
    }
end
1 Like

IMHO, seems pretty readable :slight_smile:

My understanding

This sums numbers which are grouped by “paragraphs”, while catching parsing errors. One number per line, and paragraphs separated by two new lines.

  1. Read the file
  2. Split by paragraphs
  3. Sum each one
  4. Show the sorted list of sums
  5. Show the top value (last) & the top 3
  6. Assign to variables
  7. Return

The hard part (so to speak) was to infer the file’s structure. But probably this would’ve been the same with classical syntax.

1 Like

I know this is probably a vanity complaint, but having more braces in the language does not sound like a good idea to me. Julia having relatively few different braces and meanings associated with them makes it much easier to read for me - I’m already having a hard time distinguishing the examples in this thread, with ({ and similar close together (and I think my eyesight is still pretty good - I don’t have glasses). Julia not having curly braces is part of the reason I like the language in the first place.

I’ve also noticed that over the past 4(!) threads, the people who comment/engage have shifted quite a bit, and I think this has become a bit of an echo chamber. I of course don’t know exactly why that is, but I suspect that people generally moved on because they just don’t care too much about it: the macro-based solutions from the various packages are sufficient for their use case, and such a big, new feature just doesn’t add enough for them to continue engaging with it.

2 Likes

On a more concrete note - while it and them work well in a macro, they don’t work as well at the language level. If curly brace syntax is supposed to be equivalent to anonymous functions (with implicit argument lists), what happens if there’s already an it = ... in the block containing it? Is there capturing? No capturing? This is the problem with adding new keywords, and ultimately why the proposals in the past used _, because that can’t be used as a variable name. This is not a problem in a macro because there it’s clear that everything is a DSL, with different semantics than the base language (though lots of macros settled on the underscore anyway, to not be ambiguous with Base).

Overall though, this just feels much closer to being a new language than julia to me.

8 Likes