Fixing the Piping/Chaining Issue (Rev 3)

On a more concrete note - while it and them work well in a macro, they don’t work as well at the language level. If curly brace syntax is supposed to be equivalent to anonymous functions (with implicit argument lists), what happens if there’s already an it = ... in the block containing it? Is there capturing? No capturing? This is the problem with adding new keywords, and ultimately why past proposals used _, because that can’t be used as a variable name. This is not a problem in a macro, because there it’s clear that everything is a DSL with different semantics than the base language (though lots of macros settled on the underscore anyway, to avoid ambiguity with Base).

Overall though, this just feels much closer to being a new language than julia to me.


If this is the remaining objection, then I’m not too concerned. :wink:

Interesting take, considering that the active participants in each thread have been different each time. It seems instead that this thread is less of a dogpile and shouting match, and more of a skeptical and pensive “hmm, that’s interesting…”. I think that’s a step in the right direction.

Not true. it and them are fully-local arguments to the chain, and they do not capture from their environment. In essence, they are locally defined keywords.
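
A rough analogy in today's Julia, using a plain anonymous function whose argument happens to be named it. This is only a sketch of the shadowing idea, not MethodChains.jl itself, and the names add_one and read_outer are made up for illustration:

```julia
it = 10                      # an ordinary variable in the enclosing scope

# The argument named `it` shadows the outer `it`; nothing is captured.
add_one = it -> it + 1

# With no argument of that name, the outer `it` is what gets read.
read_outer = () -> it + 1

add_one(1)      # works on the argument passed in
read_outer()    # works on the outer variable
```

The claim above is that it inside a chain behaves like the first case: it always refers to the value flowing through the chain, never to an outer variable of the same name.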

Locally-defined keywords aren’t unfamiliar if you have ever used the as keyword, or the abstract keyword, or the type keyword, which are keywords when used in a certain context, but outside of that context can be assigned other values.
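
A small sketch of that context sensitivity in current Julia (1.6 or later is assumed for `as`). The names Shape, Itr, and total are hypothetical and chosen only for this example:

```julia
# Outside their special positions, these are ordinary identifiers...
abstract = 1
type = 2
as = 3

# ...yet they still act as keywords where the parser expects them.
abstract type Shape end
import Base.Iterators as Itr

total = abstract + type + as
```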

Indeed that has been one of my concerns too. This is, in part, why I’ve chosen to adopt concepts and keywords from natural language—so that if a concept is foreign to mathematics or other programming languages, at least it is familiar to anyone who speaks English.

Edit: The concepts are also familiar to people who don’t speak English, but the keywords are different. German has “es” and “sie,” Spanish has “lo/la” and “los/las,” and I know I shouldn’t trust Wikipedia, but it says Chinese has “tā” and “tāmen.” The universality of singular and plural pronouns makes this an attractive concept to piggyback on. I have chosen English pronouns because Julia uses English keywords.


Just for fun. I hope your proposals are more successful.
Bjarne Stroustrup: the tragedy of the rejection of the unified function call syntax proposal (in C++) (timestamp at 45m42s, the question was asked a few secs before)


Indeed, my proposal here is essentially for a generalized unified function call syntax: generalized across argument positions, generalized to include the flexible transformations that our natural languages afford us (using pronouns), generalized for bundling sequences of calls into a callable object, and even generalized across two dimensions. And now perhaps, the complaint is that it’s too general :stuck_out_tongue_closed_eyes:

In the words of Bjarne, thus far I find the arguments against it “wholly insufficient.” Perhaps we can shake up this “echo chamber” with a more robust debate, if there is one to be had?

It’s not that one proposal for unified function call syntax was rejected; rather, there have been multiple proposals that each achieved slightly different things, and only one made it to an actual vote. That vote failed to reach consensus because this is very much a matter of taste, and different people want different things out of it. You can read a bit about it here (includes sources!).

That situation is not too different from the many different kinds of proposals for this & adjacent things in julia. The difference is that we have a powerful, proper macro system that can accommodate all of them, even without changing the core language - hence, there is much less need to add one controversial thing.


This is both a blessing and a curse. As has been discussed ad nauseam in this thread and summarized in the OP of this thread, there are compelling benefits to choosing a single, officially-ordained approach for method chaining. It is, of course, fine for other approaches to exist too, as will occur when a domain-specific case justifies it.

This brings to mind this tweet thread on group decision-making.

Update: added type assertions and [experimental] special do statement syntax.

Not my finest work, but it’ll have to do for now:

# experimental `do` statement syntax
open("./input/2.txt") do {
    ::IO # new optional arg type assertion

    # setup
    oppmoves = (rock='A', paper='B', scissors='C')
    mymoves = (rock='X', paper='Y', scissors='Z')
    outcome(oppmove, mymove) = begin
        (oppmove, mymove) ∈ ((:rock, :paper), (:paper, :scissors), (:scissors, :rock)) && return 1
        (oppmove, mymove) ∈ ((:paper, :rock), (:scissors, :paper), (:rock, :scissors)) && return -1
        0
    end

    # process
    read(it, String)
    split(it, "\n")
    map(it) do {
        try 
            oppmove = findfirst(==(it[1]), oppmoves)
            mymove = findfirst(==(it[3]), mymoves)
            score = findfirst(==(it[3]), values(mymoves))
            score += 3 + outcome(oppmove, mymove) * 3
        catch; 0 end
    } end
    sum
} end

(my preference of course, would be for f(arg) do {stuff} not to require an end keyword, but I can’t do that without parser changes.)

Regarding incorporating the proposal in the language, I was wondering if a possible path would be to incorporate it first as a macro in a pull request (so that if/when merged it will be fully supported and tested at release time like any other part of the language) and then at a later time, if there is strong usage/demand, incorporate it fully and make @mc a no-op for backward compatibility. I guess it could make creating a PR to the language easier if that’s the official way to make a proposal more serious. Just a thought, not sure if it makes a difference or not.


I really like the x.{f, g(it), h} syntax. It’s a nice solution to cases where pure functional programming is awkward, and the syntax feels quite easy to grasp. And yes, it would be awesome if using <tab> were as helpful as it is in OOP languages.

However, I will say that the multi-chain syntax feels inappropriate to use directly inside a text-based programming language. This kind of 2D programming seems best suited for full-fledged visual programming languages like Blender’s graph editor, LabVIEW, Microsoft Excel, Scratch, etc. In a text-based language it just looks so out-of-place. Are there any other languages that have something like this? I also feel like spaces should never have that much power (even Python’s explicit use of spaces to demarcate a scope feels illegal). Maybe it’s just me though. e.g., I would rather the multi-chains just output 3-tuples, and each step use one of it[1], it[2], it[3].

Maybe one day, but I think it will be harder to gain traction if the multi-chain syntax is included in the proposed change, whereas with {} it seems feasible to win support for.

e.g., writing the example without the multi-chain seems fine to me

(0:10...,).{
   avg = {len=it.{length}, sum(it)/len}
   μ = it.{avg}
   it .- μ
   (it.^2,       it.^2,       abs.(it)      )
   (avg(it[1]),  avg(it[2]),  maximum(it[3]))
   (sqrt(it[1]), it[2],       it[3]         )
}

and you could space it out as you wish, instead of having the spacing control the function.
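
For comparison, here is roughly what the tuple-per-step version computes when written in plain Julia without any brace syntax. This is a sketch of the intended arithmetic as I read the example, not output from MethodChains.jl; the intermediate names (centered, step1, step2, result) are invented here:

```julia
avg(xs) = sum(xs) / length(xs)

x = (0:10...,)                      # the tuple (0, 1, ..., 10)
centered = x .- avg(x)              # subtract the mean

# Each step is a 3-tuple; the next step indexes into it.
step1 = (centered .^ 2, centered .^ 2, abs.(centered))
step2 = (avg(step1[1]), avg(step1[2]), maximum(step1[3]))
result = (sqrt(step2[1]), step2[2], step2[3])   # (std, variance, max deviation)
```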


Agreed with @MilesCranmer. If this is a serious bid to get into Base (not that I have any say whatsoever in the matter), it may be better to add the functionality incrementally, e.g. start with “just” the chaining call syntax like x.{f, g} as sugar for (y -> g(f(y)))(x). (A named argument is used here because a bare _ cannot be read as a value in Julia.)
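
To make the "just sugar" reading concrete, these spellings are all available in Base today and give the same result. The functions f and g are placeholders made up for the example:

```julia
f(x) = x + 1
g(x) = 2x
x = 3

nested   = g(f(x))               # ordinary nested call
piped    = x |> f |> g           # left-to-right with the pipe operator
composed = (g ∘ f)(x)            # function composition
lambda   = (y -> g(f(y)))(x)     # explicit anonymous function
```

Under the minimal proposal, x.{f, g} would be one more spelling of the same thing.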

Some of the other parts of the proposal, like the it keyword or the multi-chains can always be added later if there is demand for them. There is no need to bundle all the proposed functionality together.


I second the use of commas instead of whitespace; it looks cleaner in my opinion. (I think we should let people drop the parentheses, though, since they’re not providing any extra information.)

Using commas also solves the problem of figuring out which values to discard. If you want to drop a value, just leave the space before/after a comma blank instead of writing it or _. This also lets you avoid using two separate keywords (it and them) when just one, like _, would do.

(0:10...,).{
   mean
   _ .^2, _.^2, abs.(_)
   _ .- 3
   , _,
   sqrt(_)
}

I like this. Mentally, I see the dot as broadcasting. The pipe is one extra key press, but I like it.


I think I agree that I slightly prefer the pipe syntax. It’s more clear about what’s going on (I’m taking an object and pushing it through a bunch of functions), while x.{...} doesn’t give any sense about which direction inputs are flowing, in addition to looking a lot like broadcasting.

That being said, I’m not going to suggest holding up such an amazing feature over minor bikeshedding like this. Thank you so much for implementing this @uniment!


I’m not aware of any popular text-based 2D programming. I’d guess for the most part people either assume it’s impossible, or only try in toy languages which make it uninteresting; this proposal is the strange intersection between: absolute madlad geniuses designing a language and parser ∩ deprecation of a context in which such powerful syntax is valid ∩ some stubborn bonehead asking “What if I try this?”

The first language I encountered that allowed me to form expressions like this was MATLAB (for matrix building), and coming from C-style languages I remember it feeling weird and wrong to me back then too. Seeing it now in Julia and understanding it better, it’s not so foreign, so I suspect it’s a matter of familiarity with the language feature.

In short, it parses like matrices: a single space or a linebreak has meaning, but multiples don’t (so you can add spaces as desired for alignment), and semicolons can be used in place of linebreaks. It’s [mostly] consistent with the rest of Julia (and inconsistent with Python).
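
You can check this without any package by asking the parser what it sees. The sketch below only inspects the parsed expressions with Meta.parse; the variable names are made up, and nothing here evaluates the braces:

```julia
comma_form = Meta.parse("{f, g}")        # comma-separated, like a vector literal
row_form   = Meta.parse("{a b}")         # space-separated, like a matrix row
grid_form  = Meta.parse("{a b; c d}")    # rows separated by semicolons

comma_form.head    # :braces
row_form.head      # :bracescat

# Extra spaces do not change the parse, and a linebreak acts like a semicolon.
same_with_padding  = Meta.parse("{a    b;   c d}") == grid_form
same_with_newline  = Meta.parse("{a b\n c d}") == grid_form
```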

I suspect the answer is: try it and see how you like it! Maybe it’s an acquired taste that I got from doing too much matrix math where similar syntax is valid, but I feel like if I can get onboard with it, anybody can.

I think my main reservations with this approach are that it’s much more verbose and it locks the parallel chains’ executions into lockstep, making multithreading essentially impossible. As it is, you can add spaces to align expressions as you wish.
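
To illustrate what I mean by independent chains being parallelizable in principle, here is a sketch in plain Julia where each "column" is its own task. This is not how MethodChains.jl schedules anything; it is only an assumption about what a parallel implementation could look like, and the task names are invented:

```julia
using Base.Threads: @spawn

avg(xs) = sum(xs) / length(xs)
centered = collect(-5.0:5.0)

# Each column depends only on `centered`, so the three can run independently.
rms_task = @spawn sqrt(avg(centered .^ 2))
var_task = @spawn avg(centered .^ 2)
max_task = @spawn maximum(abs.(centered))

results = fetch.((rms_task, var_task, max_task))
```

With the tuple-per-step form, every step has to finish for all three elements before the next step starts.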

You could be right. Part of my reason to explore multi-chains in this proposal is in recognition of the special powerful parsing machinery that {} comes with, to try and see if I can do it justice while maintaining coherence with 1D chain semantics (that is to say, I don’t want to waste powerful {} syntax on something that can’t grow into it). But perhaps 2D chains are too jarring for people to accept readily. I don’t know the history, but I have to imagine mathematicians went mad when matrices were first invented. (Come to think of it, I didn’t like matrices either when I first learned of them :laughing:)

You seem quite opposed to the it keyword, as you’ve brought it up a couple times! :sweat_smile: Unfortunately, eliminating it would sufficiently kneecap the proposal as to make it largely uninteresting (i.e., no longer able to call n-arg functions nor do “quick lambda” expressions), so I don’t see that as a workable approach. Furthermore, if the rest of the proposal were to be implemented before any local keywords were chosen, then implementing keywords later would be a breaking change; keywords must be selected upfront to avoid this.

I feel like the reasons I’ve offered for choosing it have been pretty thorough, well-reasoned, and compelling. (color on it; semantics and valid identifiers; keyword locality and pronoun universality). Can you explain your reasoning for continuing to oppose it, so I can better understand your viewpoint?

As far as I can tell, this would require breaking changes to the parser, so I cannot support this. Otherwise I would!

That’s fine, this proposal allows you to do that.

I would’ve preferred x{f, g, h}, but it was already taken :sweat_smile:

The way I cope is by telling myself that it also looks sorta like property access, in similar spirit to how “do a length measurement on the object” (length(obj)) is equivalent to “measure the object’s length” (obj.{length}). There could be better ways to cope, but this is what I’ve found.

Thanks! :pray: Let’s see where it goes. I want to spend more time testing it to see if it’s the best we can come up with; maybe we can get some serious counter-proposals with more compelling semantics, or maybe somebody can find a better use for Julia’s powerful {} parsing machinery!


Yeah, I don’t think it’s too bad–this is actually pretty common idiomatic Java, e.g. this example I took off a

List<Object> foo = Stream
    .concat(
        reverseStream(
            list1.stream()
                .map(Some::func)
                .flatMap(other::stuff)),
        list2.stream()
            .map(Some::func)
            .flatMap(other::stuff)))
    .map(Some::otherFunc)
    .collect(Collectors.toList());

Though perhaps being idiomatic Java should be disqualifying for inclusion in Julia.

Hmm, in that case maybe just sticking with treating the objects as tuples works best.

You’re not allowed to make tuples with missing elements separated by commas, either.

As for why not to use tuples, I offered this:

The way it works, you can use tuples in single-chains if you wish to avoid multi-chains, but I don’t see how it makes sense to impose that constraint on yourself when multi-chain syntax is more succinct and [potentially] parallelizable.

To me the broadcast syntax is an issue, presumably this would be used on collections a lot, and having to change the order of things or the syntax to broadcast looks quite arcane to me ({it+1, abs2}.((0,1,2,3)), (0,1,2,3).{{it+1, abs2}.(it)}).

This seems too nice to pass on (even though it might not be possible to implement currently):

f(x)    # single element   
f.(x)   # broadcast      

x{f}    # single element     
x.{f}   # broadcast
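
For reference, broadcasting a composed chain over a collection already works in Base with ∘ and the usual dot. This is a sketch using only Base; inc_then_square is a name made up for the example:

```julia
# The chain {it+1, abs2} written as an ordinary composed function.
inc_then_square = abs2 ∘ (x -> x + 1)

broadcasted = inc_then_square.((0, 1, 2, 3))    # (1, 4, 9, 16)
mapped      = map(inc_then_square, (0, 1, 2, 3))
```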

Maybe just a small bug:

julia> @mc {it.x = 2 - 1, sin(it.x)}(r)

0.8414709848078965

julia> @mc {it.x = 2 -1, sin(it.x)}(r)

ERROR: syntax: unexpected comma in array expression

I agree! Unfortunately, x{f} is already claimed (for a pretty important feature of the language :wink:), or I definitely would have taken this approach. I’m open to ideas.

Wow I wasn’t expecting someone to run into this so fast. :sweat_smile: It’s an artifact of how arrays are processed, which is slightly different than normal syntax outside of arrays:

julia> [1 - 2] # 1-element column vector
1-element Vector{Int64}:
 -1

julia> [1 -2] # 2-element row vector
1×2 Matrix{Int64}:
 1  -2

julia> [1-2,] # 1-element column vector
1-element Vector{Int64}:
 -1

julia> [1 -2,] # error
ERROR: syntax: unexpected comma in array expression

It throws an error here because , commas aren’t supposed to appear in matrix definitions.

We can also see it in another context where you might not expect it: space-delimited macro arguments.

julia> @show(1 -2) # comma-delimited b/c inside parentheses
1 - 2 = -1
-1

julia> @show 1 - 2 # one argument
1 - 2 = -1
-1

julia> @show 1 -2 # two arguments
1 = 1
-2 = -2
-2

Another mildly unexpected behavior:

julia> abstract = 1; type = 2;

julia> [type abstract]
1×2 Matrix{Int64}:
 2  1

julia> [abstract type]
ERROR: syntax: unexpected "]"

I wasn’t expecting anyone to bump into it so fast. It’s simply how the parsing machinery works for [] matrices, which extends to {} (but *not* () tuples or blocks). I don’t think it’s terrible, considering how easy it is to avoid (i.e., don’t name variables abstract and type, and avoid writing - in a way that suggests the use of the unary operator unless it’s intended to make multi-chains), but it’s possible to have unexpected behavior, and it can be pretty non-obvious why an error is being thrown.
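
If you want to see which way a given spelling will be read before running it, Meta.parse shows the difference. This is a sketch with Base only; wrapping the subtraction in parentheses is one possible workaround, not an official recommendation:

```julia
spaced_minus = Meta.parse("[1 - 2]")     # one element: the expression 1 - 2
unary_minus  = Meta.parse("[1 -2]")      # two elements: 1 and -2
guarded      = Meta.parse("[(1 -2), 3]") # parentheses force binary minus

spaced_minus.head   # :vect
unary_minus.head    # :hcat

# Mixing the row form with a trailing comma is rejected by the parser.
comma_rejected = try
    Meta.parse("[1 -2,]")
    false
catch
    true
end
```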

Maybe worth a note? Or? I’m open to ideas.


This is a fun thread! But I want to draw attention to the heart of the proposal.

What we want

  1. Easy-to-type, left-to-right chainable function composition

    Like x |> f |> g or x.f().g() instead of g(f(x)).

  2. Easy-to-type partial function application

    We already have x -> f(x, a), but we want f(_, a), or even just f(a) if the first argument is implied.
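
For context, this is roughly what Base offers today for both wishes, without any macro. The function add is a stand-in made up for the example; Base.Fix2 is the existing Base helper that fixes a function's second argument:

```julia
add(x, a) = x + a

# Partial application, two existing spellings:
with_lambda = x -> add(x, 10)       # explicit anonymous function
with_fix2   = Base.Fix2(add, 10)    # fixes the second argument

# Left-to-right chaining with the pipe operator:
piped = 5 |> with_fix2 |> string
```

Both work, but neither is as short as f(_, a) would be, which is the gap the proposals are trying to close.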

We have this with macros

Both these conveniences are readily offered with, e.g., Chain.jl (as noted above).

using Chain
@chain "hello" split("") _.^2 join("•") uppercase
# …is the same as…
uppercase(join(split("hello", "").^2, "•"))

So why are we still talking about it? Well:

What @uniment’s MethodChains.jl proposal offers

The main selling point seems to be repurposing brace syntax { } to be function composition with built-in support for partial application.

There is a lot more to this proposal, but I’m not so sure about the rest of it.

Using braces as a Chain.jl syntax

We can achieve the core of this proposal simply by transforming the brace syntax into a Chain.jl-style chain:

using Chain
using MacroTools: postwalk

bracechains(expr) = postwalk(expr) do node
	if node isa Expr && node.head ∈ (:braces, :bracescat)
		arg = gensym()
		:($arg -> @chain $arg $(node.args...))
	else
		node
	end
end

To enable this syntax transformation in the REPL, run:

pushfirst!(Base.active_repl_backend.ast_transforms, bracechains)

Then you can do things like:

julia> "chains" |> {
           split("")
           _ .^ 2
           join("•")
           uppercase
       }
"CC•HH•AA•II•NN•SS"

julia> f = {repr, reverse, "(( $_ ))"}
#73 (generic function with 1 method)

julia> f(0xCAFE)
"(( efacx0 ))"

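If you don't have Chain.jl and MacroTools installed, the core idea can be sketched with Base alone. This is a deliberately reduced version under my own assumptions: it only handles comma-separated braces of plain function names, has no _ placeholder support, and braces_to_pipeline is a hypothetical helper, not part of any package:

```julia
# Recursively rewrite {f, g, h} into x -> h(g(f(x))).
function braces_to_pipeline(ex)
    ex isa Expr || return ex
    args = map(braces_to_pipeline, ex.args)
    if ex.head === :braces
        x = gensym("x")
        # Wrap the argument in each function call, left to right.
        body = foldl((acc, f) -> :($f($acc)), args; init = x)
        return :($x -> $body)
    end
    return Expr(ex.head, args...)
end

chain = eval(braces_to_pipeline(Meta.parse("{repr, reverse, uppercase}")))
shout = chain(0xCAFE)
```

Handing the rewritten expression to @chain, as in the version above, is what adds the _ placeholder and the implied first argument.
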
Differences

Personally, I find x |> {f, g} better than x.{f, g}, since to a Julian the former looks like function application while the latter looks like broadcasting.

I also prefer Chain.jl’s use of _ as the anonymous argument over it or any alphanumeric name.

Overall, this brace syntax is nice, but I’m not convinced it’s that much better than using Chain.jl as-is. Though nobody can deny it’s easier to type!

I think we should keep playing around like this… but not get too carried away! In the end, the simpler the better.
