Fixing the Piping/Chaining Issue (Rev 3)

I really like the x.{f, g(it), h} syntax. It’s a nice solution to cases where pure functional programming is awkward, and the syntax feels quite easy to grasp. And yes, it would be awesome if using <tab> was as helpful as it is in OOP languages.

However, I will say that the multi-chain syntax feels inappropriate to use directly inside a text-based programming language. This kind of 2D programming seems best suited for full-fledged visual programming languages like Blender’s graph editor, LabVIEW, Microsoft Excel, Scratch, etc. In a text-based language it just looks so out-of-place. Are there any other languages that have something like this? I also feel like spaces should never have that much power (even Python’s explicit use of spaces to demarcate a scope feels illegal). Maybe it’s just me though. e.g., I would rather the multi-chains just output 3-tuples, and each step use one of it[1], it[2], it[3].

Maybe one day, but I think the proposed change will have a harder time gaining traction if the multi-chain syntax is included, whereas {} on its own seems feasible to win support for.

e.g., writing the example without the multi-chain seems fine to me

(0:10...,).{
   avg = {len=it.{length}, sum(it)/len}
   μ = it.{avg}
   it .- μ
   (it.^2,       it.^2,       abs.(it)      )
   (avg(it[1]),  avg(it[2]),  maximum(it[3]))
   (sqrt(it[1]), it[2],       it[3]         )
}

and you could space it out as you wish, instead of having the spacing control the function.
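For comparison, the same computation can already be written in current Julia with plain tuples and indexing. This is only a sketch of the idea: the names xs, d, t, and result are my own, and avg is defined as an ordinary function rather than through the proposed it syntax.

```julia
# Plain-Julia sketch of the tuple-based chain above (no proposed syntax needed).
xs = (0:10...,)                      # tuple holding 0, 1, ..., 10
avg(v) = sum(v) / length(v)          # ordinary helper function

μ = avg(xs)
d = xs .- μ                          # deviations from the mean
t = (d .^ 2, d .^ 2, abs.(d))        # three parallel "chains" held in a 3-tuple
t = (avg(t[1]), avg(t[2]), maximum(t[3]))
result = (sqrt(t[1]), t[2], t[3])    # (std, variance, max abs deviation)
```

Each step is one line and reads top to bottom, at the cost of naming the intermediate tuple.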

9 Likes

Agreed with @MilesCranmer. If you’re serious about getting this into Base (not that I have any say whatsoever in the matter), it may be better to add the functionality incrementally. For example, start with “just” the chaining call syntax like x.{f, g} as sugar for (y -> g(f(y)))(x)
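To make the intended desugaring concrete, here is a small sketch in current Julia showing that the lambda form agrees with the spellings we already have. The functions f and g are placeholders I made up for illustration.

```julia
# Placeholder functions, chosen only for illustration.
f(x) = x + 1
g(x) = 2x

x = 3
nested   = g(f(x))                 # ordinary nested call
lambda   = (y -> g(f(y)))(x)       # what x.{f, g} would be sugar for
piped    = x |> f |> g             # existing pipe operator
composed = (g ∘ f)(x)              # existing composition operator
```

All four evaluate to 8, so the minimal version of the proposal adds a new spelling rather than new semantics.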

Some of the other parts of the proposal, like the it keyword or the multi-chains can always be added later if there is demand for them. There is no need to bundle all the proposed functionality together.

3 Likes

I second the use of commas instead of whitespace; it looks cleaner in my opinion. (I think we should let people drop the parentheses, though, since they’re not providing any extra information.)

Using commas also solves the problem of figuring out which values to discard. If you want to drop a value, just leave the space before/after a comma blank instead of writing it or _. This also lets you avoid using two separate keywords (it and them) when just one, like _, would do.

(0:10...,).{
   mean
   _ .^2, _.^2, abs.(_)
   _ .- 3
   , _,
   sqrt(_)
}

I like this – mentally I see the dot as broadcasting. The pipe is one extra key press, but I like it.

1 Like

I think I agree that I slightly prefer the pipe syntax. It’s more clear about what’s going on (I’m taking an object and pushing it through a bunch of functions), while x.{...} doesn’t give any sense about which direction inputs are flowing, in addition to looking a lot like broadcasting.

That being said, I’m not going to suggest holding up such an amazing feature over minor bikeshedding like this. Thank you so much for implementing this @uniment!

1 Like

I’m not aware of any popular text-based 2D programming. I’d guess for the most part people either assume it’s impossible, or only try in toy languages which make it uninteresting; this proposal is the strange intersection between: absolute madlad geniuses designing a language and parser ∩ deprecation of a context in which such powerful syntax is valid ∩ some stubborn bonehead asking “What if I try this?”

The first language I encountered that allowed me to form expressions like this was MATLAB (for matrix building), and coming from C-style languages I remember it feeling weird and wrong for me too back then. Seeing it now in Julia and understanding it better, it’s not so foreign, so I suspect it’s a matter of familiarity with the language feature.

In short, it parses like matrices: a single space or a linebreak has meaning, but multiples don’t (so you can add spaces as desired for alignment), and semicolons can be used in place of linebreaks. It’s [mostly] consistent with the rest of Julia (and inconsistent with Python).
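A quick sketch of those rules using ordinary matrix literals in current Julia (the variable names are mine):

```julia
# Linebreaks separate rows; a single space separates columns.
A = [1 2
     3 4]

# Extra spaces used for alignment do not change the result,
# and semicolons can stand in for the linebreaks.
B = [1   2;   3   4]
C = [1 2; 3 4]

same = (A == B == C)
```

Here same is true: all three literals build the same 2×2 matrix.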

I suspect the answer is: try it and see how you like it! Maybe it’s an acquired taste that I got from doing too much matrix math where similar syntax is valid, but I feel like if I can get onboard with it, anybody can.

I think my main reservations with this approach are that it’s much more verbose and it locks the parallel chains’ executions into lockstep, making multithreading essentially impossible. As it is, you can add spaces to align expressions as you wish.

You could be right. Part of my reason to explore multi-chains in this proposal is in recognition of the special powerful parsing machinery that {} comes with, to try and see if I can do it justice while maintaining coherence with 1D chain semantics (that is to say, I don’t want to waste powerful {} syntax on something that can’t grow into it). But perhaps 2D chains are too jarring for people to accept readily. I don’t know the history, but I have to imagine mathematicians went mad when matrices were first invented. (Come to think of it, I didn’t like matrices either when I first learned of them :laughing:)

You seem quite opposed to the it keyword, as you’ve brought it up a couple times! :sweat_smile: Unfortunately, eliminating it would sufficiently kneecap the proposal as to make it largely uninteresting (i.e., no longer able to call n-arg functions nor do “quick lambda” expressions), so I don’t see that as a workable approach. Furthermore, if the rest of the proposal were to be implemented before any local keywords were chosen, then implementing keywords later would be a breaking change; keywords must be selected upfront to avoid this.

I feel like the reasons I’ve offered for choosing it have been pretty thorough, well-reasoned, and compelling. (color on it; semantics and valid identifiers; keyword locality and pronoun universality). Can you explain your reasoning for continuing to oppose it, so I can better understand your viewpoint?

As far as I can tell, this would require breaking changes to the parser, so I cannot support this. Otherwise I would!

That’s fine, this proposal allows you to do that.

I would’ve preferred x{f, g, h}, but it was already taken :sweat_smile:

The way I cope is by telling myself that it also looks sorta like property access, in similar spirit to how “do a length measurement on the object” (length(obj)) is equivalent to “measure the object’s length” (obj.{length}). There could be better ways to cope, but this is what I’ve found.

Thanks! :pray: Let’s see where it goes. I want to spend more time testing it to see if it’s the best we can come up with; maybe we can get some serious counter-proposals with more compelling semantics, or maybe somebody can find a better use for Julia’s powerful {} parsing machinery!

1 Like

Yeah, I don’t think it’s too bad–this is actually pretty common idiomatic Java, e.g. this example I took off a

List<Object> foo = Stream
    .concat(
        reverseStream(
            list1.stream()
                .map(Some::func)
                .flatMap(other::stuff)),
        list2.stream()
            .map(Some::func)
            .flatMap(other::stuff)))
    .map(Some::otherFunc)
    .collect(Collectors.toList());

Though perhaps being idiomatic Java should be disqualifying for inclusion in Julia.

Hmm, in that case maybe just sticking with treating the objects as tuples works best.

You’re not allowed to make tuples with missing elements separated by commas, either.

As for why not to use tuples, I offered this:

The way it works, you can use tuples in single-chains if you wish to avoid multi-chains, but I don’t see how it makes sense to impose that constraint on yourself when multi-chain syntax is more succinct and [potentially] parallelizable.

To me the broadcast syntax is an issue: presumably this would be used on collections a lot, and having to change the order of things or the syntax in order to broadcast looks quite arcane to me ({it+1, abs2}.((0,1,2,3)), (0,1,2,3).{{it+1, abs2}.(it)}).
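For reference, this is roughly what the same chain looks like today with function composition, where broadcasting is just the usual dot on the call. It is a sketch using only Base; the name inc_then_square is mine.

```julia
# Compose "add one" and abs2 into a single function (rightmost runs first).
inc_then_square = abs2 ∘ (x -> x + 1)

single = inc_then_square(3)              # applied to one element
many   = inc_then_square.((0, 1, 2, 3))  # broadcast over a tuple
```

Here single is 16 and many is (1, 4, 9, 16); the only difference between the two calls is the dot.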

This seems too nice to pass on (even though it might not be possible to implement currently):

f(x)    # single element   
f.(x)   # broadcast      

x{f}    # single element     
x.{f}   # broadcast

Maybe just a small bug:

julia> @mc {it.x = 2 - 1, sin(it.x)}(r)
0.8414709848078965

julia> @mc {it.x = 2 -1, sin(it.x)}(r)
ERROR: syntax: unexpected comma in array expression

I agree! Unfortunately, x{f} is already claimed (for a pretty important feature of the language :wink:), or I definitely would have taken this approach. I’m open to ideas.

Wow I wasn’t expecting someone to run into this so fast. :sweat_smile: It’s an artifact of how arrays are processed, which is slightly different than normal syntax outside of arrays:

julia> [1 - 2] # 1-element column vector
1-element Vector{Int64}:
 -1

julia> [1 -2] # 2-element row vector
1×2 Matrix{Int64}:
 1  -2

julia> [1-2,] # 1-element column vector
1-element Vector{Int64}:
 -1

julia> [1 -2,] # error
ERROR: syntax: unexpected comma in array expression

It throws an error here because commas (,) aren’t supposed to appear in matrix definitions.
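One way to see the difference without evaluating anything is to ask the parser directly. The sketch below uses Meta.parse from Base; the variable names are mine.

```julia
# Spaces on both sides of `-`: one element containing a binary subtraction.
one_elem = Meta.parse("[1 - 2]")

# Space only before `-`: two elements, the second being a negative literal.
two_elem = Meta.parse("[1 -2]")

heads = (one_elem.head, two_elem.head)   # (:vect, :hcat)
sizes = (size([1 - 2]), size([1 -2]))    # ((1,), (1, 2))
```

The first form is a :vect (comma-style vector) and the second is an :hcat (row concatenation), which is why a trailing comma is accepted after one and rejected after the other.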

We can also see it in another context where you might not expect it: space-delimited macro arguments.

julia> @show(1 -2) # comma-delimited b/c inside parentheses
1 - 2 = -1
-1

julia> @show 1 - 2 # one argument
1 - 2 = -1
-1

julia> @show 1 -2 # two arguments
1 = 1
-2 = -2
-2
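The same check can be done for macro calls by counting the arguments the parser hands to the macro. This is a sketch; macro_args is a hypothetical helper of mine, not something from Base.

```julia
# Hypothetical helper: parse a macro call and return the arguments the macro
# would receive (dropping the macro name and the line-number node).
macro_args(src) = filter(a -> !(a isa LineNumberNode), Meta.parse(src).args[2:end])

n_spaced = length(macro_args("@show 1 - 2"))   # one argument: 1 - 2
n_unary  = length(macro_args("@show 1 -2"))    # two arguments: 1 and -2
n_parens = length(macro_args("@show(1 -2)"))   # one argument: 1 - 2
```

The counts come out as 1, 2, and 1, matching the three REPL results above.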

Another mildly unexpected behavior:

julia> abstract = 1; type = 2;

julia> [type abstract]
1×2 Matrix{Int64}:
 2  1

julia> [abstract type]
ERROR: syntax: unexpected "]"

This is simply how the parsing machinery works for [] matrices, which extends to {} (but *not* to () tuples or blocks). I don’t think it’s terrible, considering how easy it is to avoid (i.e., don’t name variables abstract and type, and don’t write - in a way that looks like the unary operator unless you intend to make multi-chains), but unexpected behavior is possible, and it can be pretty non-obvious why an error is being thrown.

Maybe worth a note? Or? I’m open to ideas.

1 Like

This is a fun thread! But I want to draw attention to the heart of the proposal.

What we want

  1. Easy-to-type, left-to-right chainable function composition

    Like x |> f |> g or x.f().g() instead of g(f(x)).

  2. Easy-to-type partial function application

    We already have x -> f(x, a), but we want f(_, a), or even just f(a) if the first argument is implied.
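As a baseline, here is a sketch of how far plain Base already gets us on both points, using |> for chaining and either a lambda or Base.Fix2 for partial application. The variable names are mine.

```julia
a = ","

# Two existing ways to fix the second argument of split:
split_lambda = x -> split(x, a)        # anonymous function
split_fix2   = Base.Fix2(split, a)     # callable struct from Base

# Left-to-right chaining with the existing pipe operator.
n1 = "1,2,3" |> split_lambda |> length
n2 = "1,2,3" |> split_fix2 |> length
```

Both n1 and n2 are 3. It works, but it is noticeably more typing than f(_, a) would be.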

We have this with macros

Both these conveniences are readily offered with, e.g., Chain.jl (as noted above).

using Chain
@chain "hello" split("") _.^2 join("•") uppercase
# …is the same as…
uppercase(join(split("hello", "").^2, "•"))

So why are we still talking about it? Well:

What @uniment’s MethodChains.jl proposal offers

The main selling point seems to be repurposing brace syntax { } to be function composition with built-in support for partial application.

There is a lot more to this proposal, but I’m not so sure about the rest of it.

Using braces as a Chain.jl syntax

We can achieve the core of this proposal simply by transforming the brace syntax into a Chain.jl-style chain:

using Chain
using MacroTools: postwalk

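# Walk the expression tree and rewrite every brace expression into an
# anonymous function whose body is a Chain.jl `@chain` over its argument.
bracechains(expr) = postwalk(expr) do node
    if node isa Expr && node.head ∈ (:braces, :bracescat)
        arg = gensym()  # fresh name, so user variables are never captured
        :($arg -> @chain $arg $(node.args...))
    else
        node  # leave everything else untouched
    end
end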
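The two heads checked above correspond to the two ways of separating steps inside braces. A small sketch with Meta.parse shows what the parser produces (brace expressions still parse today even though they cannot be evaluated as-is); the variable names are mine.

```julia
comma_form = Meta.parse("{f, g}")   # steps separated by commas
semi_form  = Meta.parse("{f; g}")   # steps separated by semicolons or linebreaks

comma_head = comma_form.head        # :braces
semi_head  = semi_form.head         # :bracescat
steps      = comma_form.args        # Any[:f, :g]
```

In both cases the steps end up in args, which is what gets spliced into the @chain call.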

To enable this syntax transformation in the REPL, run:

pushfirst!(Base.active_repl_backend.ast_transforms, bracechains)

Then you can do things like:

julia> "chains" |> {
           split("")
           _ .^ 2
           join("•")
           uppercase
       }
"CC•HH•AA•II•NN•SS"

julia> f = {repr, reverse, "(( $_ ))"}
#73 (generic function with 1 method)

julia> f(0xCAFE)
"(( efacx0 ))"
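If you want to experiment without installing Chain.jl, a stripped-down version of the idea can be sketched with Base alone. Note the assumptions: braces_to_function is a hypothetical helper of mine, and it only supports bare function names as steps (no _ placeholder, no partial application), so it is much less capable than the transform above.

```julia
# Hypothetical helper (not part of Chain.jl or MethodChains.jl):
# turn the expression `{f, g, h}` into the expression `x -> h(g(f(x)))`.
function braces_to_function(ex::Expr)
    ex.head in (:braces, :bracescat) ||
        throw(ArgumentError("expected a brace expression"))
    arg = gensym("x")                                   # fresh argument name
    # Wrap the argument in one call per step, left to right.
    body = foldl((acc, f) -> :($f($acc)), ex.args; init = arg)
    return :($arg -> $body)
end

chain_expr = braces_to_function(Meta.parse("{abs, sqrt, string}"))
chain = eval(chain_expr)
out = chain(-16)   # "4.0"
```

The steps run left to right: abs gives 16, sqrt gives 4.0, and string gives "4.0".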

Differences

Personally, I find x |> {f, g} better than x.{f, g}, since to a Julian the former looks like function application while the latter looks like broadcasting.

I also prefer Chain.jl’s use of _ as the anonymous argument over it or any alphanumeric name.

Overall, this brace syntax is nice, but I’m not convinced it’s that much better than using Chain.jl as-is. Though nobody can deny it’s easier to type!

I think we should keep playing around like this… but not get too carried away! In the end, the simpler the better.

16 Likes

I can: typing { on my German keyboard requires pressing AltGr (on the right side of the spacebar) as well as 7 (0 for }) in the number row (also on the right side of the keyboard). That’s more difficult to type than @ or |>, since those are on the left side of the keyboard, allowing me to use my left hand as well.

3 Likes

Quite creative, I like it!

I think there’s a certain nuance that’s missing though, which is:

Namely, I’m trying to create a concept which is as general as possible, and as a result, trying to avoid the behavior of threading the argument into a default position. Additionally, as has been expressed previously, the use of _ is somewhat wasteful, when within the local context it is possible to define new keywords—this would keep _ free for use in partial application, as PR #24990 proposes.

Indeed, if this proposal were to be accepted and #24990 were accepted (which is my hope), then many of the expressions here could look closer to Chain.jl:

"chains".{
    split(_, "")
    _.^2
    join(_, "•")
    uppercase
}

The other points that this misses are: 1) creation of an unneeded lambda (increases compile time) and 2) operator precedence. It’s frequent for chains to be short sub-expressions inside larger ones (e.g., arr.{filter(f, _), first}.a+1), and the lower precedence of |> forces more of it to be placed within the chain, e.g. arr |> {filter(f, _), first, it.a+1}, which is clumsier. The high precedence of . is useful.
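The precedence point can be checked in current Julia. The sketch below uses Meta.parse to show how |> groups relative to +, and then the parentheses needed to get the same result as ordinary property access; arr and its field a are made-up example data.

```julia
# |> binds more loosely than +, so the `+ 1` is absorbed into the right-hand side.
pipe_groups_loosely = Meta.parse("arr |> first + 1") == :(arr |> (first + 1))

# Property access binds tightly, so the `+ 1` applies to the accessed value.
dot_groups_tightly = Meta.parse("x.a + 1") == :((x.a) + 1)

# Made-up data: a vector of named tuples with a field `a`.
arr = [(a = 1,), (a = 2,)]
v_call = first(arr).a + 1          # direct call, no parentheses needed
v_pipe = (arr |> first).a + 1      # pipe version needs explicit parentheses
```

Both checks are true, and v_call and v_pipe are both 2; the pipe version only works because of the extra parentheses.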

Oh dear, now I understand the dislike for curly braces! :sweat_smile: How prevalent are such keyboards?

2 Likes

As far as I know, Belgian and French too…

3 Likes

Another point I think bears re-emphasis:

A large motivation for pushing for acceptance as a language feature is to get autocomplete support. Not only would this improve method discoverability, as the OO folks enjoy, but it would also make threading into a default argument position (first for Chain.jl, last for DataPipes.jl) less compelling: for example, if this proposal and PR #24990 were accepted, then in "1:2:3".{split(_, ":")}, an autocomplete would likely fill out (_, ) and place the cursor after the comma, saving the effort of typing the underscore.

(Before I catch flak for the compile time of constructing a partial applicator, in proposal #2 I proposed giving _ preferred treatment within the chaining syntax so that it would be syntax-transformed into a simple function call. I didn’t restate that in this proposal, but I do still carry that intent. Perhaps I should code it in.)

As a result, I feel inclined to oppose the behavior of automatically threading into any default argument position, because the effort it saves would be negligible.

This is not good. How do they manage C-style languages? And in Julia, are they much less inclined to use type parameterization?

I suppose if we went forward with this proposal, an autocomplete would become popular very quickly: when typing obj., probably the first option to appear should be obj.{} so that these characters need not be typed. :sweat_smile: (of course, they’d be only entered if you hit <tab>.)

we suffer.

9 Likes

Well, it’s not that bad. For Belgian, thumb and index are aligned (see link). But for French, indeed, that’s more painful …
Belgian keyboard layout
French keyboard layout
Bottom line, keep in mind that the vast majority of languages use curly braces, and this doesn’t prevent people in these countries (Germany, France, Belgium, others …) from coding in, e.g., JavaScript / C / Go / etc. … :upside_down_face:

2 Likes

I did too, until I decided to do all coding with a US keyboard layout. It works well because in code files I write everything in English anyway. And every OS has a standard keyboard shortcut to switch layout for writing documents in French/German/etc. I’m much happier ever since, also having less strain in my fingers when coding.

3 Likes

[Off-topic? I.e. this post, as some others, only about typing on different keyboards, e.g. braces.]

That applies to e.g. the Icelandic keyboard too! For me at least it feels very natural to type { and }. I suppose if people objected very strongly, Java, C, C++ and other curly-brace languages wouldn’t be popular (in Germany and some other places)…

You do get people occasionally asking for this in Julia instead of begin … end. I’m very much in favor of not doing that (and it will not happen), not because of your objection, but because braces are more useful for other things. Would this new syntax be the most useful way of using those brackets?

Well, @ for me is AltGr and Q, so it’s not easier to type with one hand, which is how I type the braces… (I never understood why the other Alt key isn’t allowed to work for this, since it does nothing.) I suppose typing @ with two hands, as I do, is slightly easier than typing the braces with one (or with two hands, if I were to adjust to doing that).

Wow, that’s an awful-looking keyboard (e.g. where M is), with ( typed as, I suppose, AltGr and 5 (plus it being AZERTY, which is at least odd to me); or maybe not, I guess, if you no longer go for using one hand (which was my first thought when responding, since I’m used to that).

1 Like

I agree with this summary. I would be curious to see the reception of the simplest parts of this, which is more or less “Chain.jl but with braces,” and with the differences you propose since I also prefer each of those, submitted as an actual PR into Base. It seems there is sometimes a different nature of discussion that happens on GitHub than on Discourse.

2 Likes