Fixing the Piping/Chaining Issue

Would you still feel that |> is insufficient if underscore placeholder syntax were part of the language?

Namely, if you could type this:

my_data |> filter(filtering_func, _)

and it would execute this:

filter(filtering_func, my_data)

would that satisfy your needs?
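
For concreteness, here is a minimal sketch of the lambda that the placeholder call would presumably stand for, written with syntax that already works today. The sample data and the choice of isodd as the filtering function are assumptions made up for illustration.

```julia
my_data = [1, 2, 3, 4, 5]
filtering_func = isodd   # illustrative choice, not from the proposal

# Proposed: my_data |> filter(filtering_func, _)
# Presumed meaning, spelled out as an explicit anonymous function:
piped = my_data |> (x -> filter(filtering_func, x))

# The direct call it is supposed to be equivalent to:
direct = filter(filtering_func, my_data)
```

Both piped and direct hold the same filtered vector; the placeholder only saves writing the x -> wrapper.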

Presently, I have a typed functor that could be appropriate for placeholder partial application:

struct Fix{F,NA,NKW,I,V,KW}
    f::F                          # function
    nargs::NA                     # number of args; `nothing` for unlimited (varargs)
    nkwargs::NKW                  # number of kwargs; `nothing` for unlimited (varkwargs)
    fixargs::V                    # fixed arg values
                                  # ::I is tuple of fixed arg indices
    fixkwargs::KW                 # fixed kwargs (keys & values)
end

Fix(f, nargs::Union{Int,Nothing}, nkwargs::Union{Int,Nothing}, fixindices::Tuple, fixargs::Tuple; fixkwargs...) = begin
    length(fixindices) > 0 && @assert all(fixindices[begin:end-1] .< fixindices[begin+1:end]) "all indices in increasing order please"
    length(fixkwargs) > 0  && @assert all(keys(fixkwargs)[begin:end-1] .< keys(fixkwargs)[begin+1:end]) "all keyword arguments in alphabetical order please"
    Fix{typeof(f), typeof(nargs), typeof(nkwargs), fixindices, typeof(fixargs), typeof(fixkwargs)}(f, nargs, nkwargs, fixargs, fixkwargs)
end
Fix(f, nargs::Union{Int,Nothing}, nkwargs::Union{Int,Nothing}, fixindices::Tuple, fixargs::Tuple, fixkwargs) = 
    Fix(f, nargs, nkwargs, fixindices, fixargs; fixkwargs...)
Fix(f, fixindices::Tuple, fixargs::Tuple; fixkwargs...) = 
    Fix(f, nothing, nothing, fixindices, fixargs, fixkwargs)
Fix(f, fixindices::Tuple, fixargs::Tuple, fixkwargs) = 
    Fix(f, fixindices, fixargs; fixkwargs...)

(fix::Fix{F,NA,NKW,I,V,KW})(args...; kwargs...) where {F,NA,NKW,I,V,KW} = begin
    arglen = length(args)+length(I)
    argitr = ((1:arglen)...,)

    fixarg_map = (I, fix.fixargs) # tuple of fixed arg indices and tuple of fixed arg values
    argsI = filter(i-> i ∉ I && i-arglen-1 ∉ I, argitr) # arguments not fixed must be called
    arg_map = (argsI, args) # tuple of called arg indices and tuple of called arg values

    argsout = map(argitr) do i
        which_map = (i ∈ I || i-arglen-1 ∈ I) ? fixarg_map : arg_map
        which_map[2][findfirst(i-arglen-1 ∈ I ? ==(i-arglen-1) : ==(i), which_map[1])]
    end
    isnothing(fix.nargs) || @assert length(argsout) == fix.nargs
    kwargs = (; kwargs..., fix.fixkwargs...)
    isnothing(fix.nkwargs) || @assert length(kwargs) == fix.nkwargs
    fix.f(argsout...; kwargs...)
end

To construct and call this object is easy:

f = (a, b, c, d) -> (a, b, c, d)
g = Fix(f, (1,), (:a,)) # funcname, fixed arg indices, fixed arg values
g(:b, :c, :d)
h = Fix(f, (1,2), (:a,:b))
h(:c, :d)
i = Fix(f, (1,2,3), (:a,:b,:c))
i(:d)
j = Fix(f, (1,2,3,4), (:a,:b,:c,:d))
j()

Negative indices denote distance from the end of the argument list (-1 for the end, -2 for next-to-end, etc.).

k = Fix(f, (-1,1), (:d,:a))
k(:b, :c)
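
The negative-index convention can be sketched as a small mapping from a fixed index to an absolute argument position. The helper name resolve_index is hypothetical and is not part of the Fix code above; it only mirrors the i-arglen-1 arithmetic used in the functor.

```julia
# Hypothetical helper: map a possibly negative fixed index to an absolute
# position in an argument list of length n.
resolve_index(i::Int, n::Int) = i > 0 ? i : n + 1 + i

# For the four-argument example above, (-1, 1) refers to positions 4 and 1.
positions = map(i -> resolve_index(i, 4), (-1, 1))
```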

Keyword arguments are allowed, and additional arguments can be inserted to specify that the called function has a fixed number of arguments (instead of varargs).

Problem

At the moment it seems to work fine, except: when fixing zero or one arguments it is type stable, but with two or more arguments fixed the functor’s call to map loses type stability. I’m struggling to figure out why.

Example:

julia> @btime Fix((a,b,c,d)->(a,b,c,d), (1,), (1,))(2, 3, 4)
  1.000 ns (0 allocations: 0 bytes)
(1, 2, 3, 4)

julia> @btime Fix((a,b,c,d)->(a,b,c,d), (1,2), (1,2))(3, 4)
  778.505 ns (17 allocations: 640 bytes)
(1, 2, 3, 4)

I would guess one reason is that arglen is an Int you are passing into a closure and that pushes it down to a runtime value.

And instead of using argitr at all you should do

argsout = ntuple(arglen) do i
    ...
end

You just have to make sure you don’t drop compile time constants down to runtime values anywhere.
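
As a rough sketch of that advice, the tuple length can be kept in the type domain by passing it to ntuple as a Val. The helper prepend_fixed below is hypothetical and much simpler than the Fix functor (it fixes only the first argument); using Val is an assumption about one way to keep the length a compile-time constant, and @inferred from the Test standard library is used to check the result.

```julia
using Test  # standard library; provides @inferred

# Hypothetical helper, not from the posts above: put one fixed value in front
# of N called arguments. N comes from the tuple's type, so Val(N + 1) is a
# compile-time constant and ntuple can unroll the construction.
prepend_fixed(fixed, called::NTuple{N,Any}) where {N} =
    ntuple(i -> i == 1 ? fixed : called[i - 1], Val(N + 1))

# @inferred throws an error if the inferred return type does not match the
# type of the value actually returned.
result = @inferred prepend_fixed(:a, (:b, :c, :d))
```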

After spending some time looking at all this, to me, this would be very useful. But that would be under the expectation that this placeholder syntax is general, meaning that it’s actually syntactic sugar for, let’s call it smart lambdas (SL), that works anywhere. isodd = _ % 2 == 1 for example.

I would then find that your OP suggestion would be very useful in addition to SL, if we make it do partial application (or eval, if conditions are met), which also works anywhere. With these two together, we would have both conciseness and clarity, and I believe that would be apparent in real world examples.

Here’s how that would look in regards to our earlier filtering shenanigans:

# today, assuming we shun magic one-parameter functions
[] |> list -> filter(n -> n > 2, list)

# smart lambdas
[] |> filter(_ > 2, _)

# OP (sane version)
[] \> filter(n -> n > 2)

# SL + OP
[] \> filter(_ > 2)

In my opinion, the last entry is both concise and clear in meaning; noticeably clearer than just the SL one as well.

The idea with partial application would also be very useful in general, since the following (shunning magic functions) becomes much clearer IMO:

# today, prepping a list to have several functions applied
prepped_list = fn -> foreach(fn, some_list)

# SL vs OP
prepped_list = foreach(_, some_list)
prepped_list = some_list \> foreach

# today, prepping a function to be applied to several lists
apply_fn = list -> foreach(some_fn, list)

# SL vs OP
apply_fn = foreach(some_fn, _)
apply_fn = some_fn /> foreach

(Note how OP references only the things actually used.)

No idea if it’s viable in Julia or not, or if it’s worth the trouble, but it would probably make a lot of the functional style stuff less cumbersome to code and parse mentally.
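
For comparison, part of this can already be written with Base.Fix1 and Base.Fix2, which fix the first or second argument of a two-argument call. The sketch below is only an approximation of the proposed syntax with existing tools; the sample list and variable names are made up for illustration.

```julia
some_list = [1, 2, 3, 4]

# Base.Fix1(f, x) behaves like y -> f(x, y).
# Roughly the role of `filter(_ > 2, _)` or `\> filter(_ > 2)` above.
keep_big = Base.Fix1(filter, >(2))
big = some_list |> keep_big

# Base.Fix2(f, x) behaves like y -> f(y, x).
# Roughly the role of `foreach(_, some_list)` above.
apply_to_list = Base.Fix2(foreach, some_list)
doubled = Int[]
apply_to_list(x -> push!(doubled, 2x))
```

These only cover two-argument calls, which is part of why more general syntax keeps coming up.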

It might make sense to use a generated function for applying a FixArgs functor. Here’s my naive implementation. It doesn’t have all the bells and whistles, but I compare the performance with and without a generated function.

struct FixArgsNotGenerated{F, I, V}
    f::F
    vals::V
end

function FixArgsNotGenerated{I}(f, vals) where I
    FixArgsNotGenerated{typeof(f), I, typeof(vals)}(f, vals)
end

function (f::FixArgsNotGenerated{F, I, V})(args...) where {F, I, V}
    front_args = []
    fixed_args_i = 0
    unfixed_args_i = 0

    last_I = last(I)

    for full_i in 1:last_I
        if full_i in I
            fixed_args_i += 1
            push!(front_args, f.vals[fixed_args_i])
        else
            unfixed_args_i += 1
            push!(front_args, args[unfixed_args_i])
        end
    end

    f.f(front_args..., args[unfixed_args_i+1:end]...)
end

struct FixArgsGenerated{F, I, V}
    f::F
    vals::V
end

function FixArgsGenerated{I}(f, vals) where I
    FixArgsGenerated{typeof(f), I, typeof(vals)}(f, vals)
end

@generated function (f::FixArgsGenerated{F, I, V})(args...) where {F, I, V}
    front_args = []
    fixed_args_i = 0
    unfixed_args_i = 0

    last_I = last(I)

    for full_i in 1:last_I
        if full_i in I
            fixed_args_i += 1
            push!(front_args, :(f.vals[$fixed_args_i]))
        else
            unfixed_args_i += 1
            push!(front_args, :(args[$unfixed_args_i]))
        end
    end

    :(f.f($(front_args...), args[$(unfixed_args_i + 1) : end]...))
end

Benchmark code:

using BenchmarkTools

foo(a, b, c, d) = (a, b, c, d)
f = FixArgsNotGenerated{(2, 4)}(foo, (:b, :d))
g = FixArgsGenerated{(2, 4)}(foo, (:b, :d))

Benchmark result:

julia> @btime f(:a, :c);
  403.475 ns (6 allocations: 288 bytes)

julia> @btime g(:a, :c);
  22.708 ns (1 allocation: 48 bytes)

For prior art on FixArgs, take a look at this package:

https://goretkin.github.io/FixArgs.jl/dev/

Although perhaps what Raf was getting at is that if you keep everything in terms of tuple operations on type values, then the compiler can automatically unroll the loop, so you don’t even need a generated function.

Yeah you can do this with reduce. We also need to use $f and $g to time this.

function (f::FixArgsNotGenerated{F,I,V})(args...) where {F,I,V}
    init = ((args, f.vals), ())
    inds = ntuple(identity, max(I...))
    _, combined_args = reduce(inds; init) do ((a, v), out), i
        if i in I
            ((a, Base.tail(v)), (out..., first(v)))
        else
            ((Base.tail(a), v), (out..., first(a)))
        end
    end
    f.f(combined_args...)
end

Although for some reason not quite as good as the generated function:

julia> @btime $f(:a, :c)
  5.844 ns (0 allocations: 0 bytes)
(:a, :b, :c, :d)

julia> @btime $g(:a, :c)
  3.264 ns (0 allocations: 0 bytes)
(:a, :b, :c, :d)

Edit: but these are same-typed inputs. I think the generated function is the way to go here.

Thanks for helping @Raf and @CameronBieganek! I’m not quite happy with it until it is perfectly transparent (so that @btime $g(:a, :c) runs in exactly the same time as @btime $foo(:a, :b, :c, :d)), but this is a great start. I certainly need more practice writing efficient functional code and @generated functions.

Although I agree, I am concerned that there is sufficient overlap in what can be done with underscore partial application as with frontfix/backfix partial application, that the extra overhead of a) implementing both and b) learning both may not be worthwhile. And as what can be done with underscore partial application (assuming we get _... slurping) is a superset of what can be done with frontfix/backfix, and it seems cleaner implementation-wise, then it seems like the horse to bet on. (akshually … underscore partial app. is a superset with the exception of not being able to apply until there are zero arguments left. Maybe there is a way to correct this…)

I’m not a fan of the extra comma and underscore in the call to filter, but if the autocomplete will fill it in then I might not mind. (Also, I set up a hotkey in Windows PowerToys to enter an underscore.) And for readability, it’s likely better to have the placeholder there than to have to learn (and remember) two additional operators on top of learning how placeholders work.

Maybe you missed my last comment… it’s much slower with mixed types, and your tests are all symbols.

Use the generated function.

Driving so this will be brief, but I wonder if {} couldn’t be used to denote partial application.

I’ve always felt a little weird that placeholder syntax has no embellishments, so it appears like it’s a function call, when in fact it’s returning a partially applied function…

Maybe?

[1, 2, 3] |> {filter {> _ 2} _}

Just spitballing.

To be clear, my thinking was that underscores would be solely used as shorthand for lambdas, and partial application would be handled solely by \> and />, which would keep everything simple and explicit.

Are you suggesting that underscores create lambdas that are sometimes partially applied and sometimes not?

No, because that syntax is already parseable & used in some macros in the ecosystem, so reassigning that would be breaking.

It does not return a partially applied function, it returns an anonymous function that has some arguments already set.

Having it appear as a function call is beneficial because that’s exactly what it represents. Having distinct syntax for the same thing is confusing.


I’ve followed this discussion for a while now, and since it’s very clear now that /> and \> have no benefit (implementation-wise, since they also require special casing in the parser) over the _-lambdas, I fail to see how having them in addition to _ is useful.

it’s very clear now that /> and \> have no benefit over the _ -lambdas

Well, besides giving more information to the IDE for tab-complete. But I understand that doesn’t seem to be a priority for most.

The follow-up then is why not do the _-lambdas? It seems that PR & discussion has been stalled for a very long time.

What is a function that merely calls another function with some arguments already set, if not a partially applied function?

I know I shouldn’t trust Wikipedia, but even Wikipedia says Scala-style underscore placeholder syntax is partial application.

I absolutely think autocomplete should be a priority! Possibly the top one! But as outlined in this post, I’ve begun to think that all that’s necessary to inform a good autocomplete is the existence of a piping operator, and first-class treatment of partial function application: by making it a proper part of the language, instead of some random package, the effort to make autocomplete work with it will be justified. We already have the former; it seems just a matter of getting the latter.

The one thing I’m trying to figure out is how to make _ placeholder partial application syntax return a function of zero arguments. That seems to be the only remaining leg up that the frontfix/backfix operators have over it.

Option 1

Where _... is used to slurp args, this could do the trick:

my_func(x) = x
this_is_spartaaahh = my_func(:aaahh!, _...)
this_is_spartaaahh()

Or similarly, slurping kwargs:

zip_zip_zip = my_func(:zip!; _...)
zip_zip_zip()

Not technically a partial function of zero arguments, but it at least allows zero arguments.

Option 2

A second idea could be this:

wen_moon = my_func(:🚀🚀🚀, ^_)
wen_moon()

This borrows from the vibe of regular expressions to indicate “no placeheld arguments”.

Option 3

A third idea:

turtle_turtle = my_func(_=:🐢🐢)
turtle_turtle()

That is to say, allow _ to consume the = operator and then trigger creation of a partial function.

This option could require deeper changes to the parser, so I’m less confident in it.

In any event, it seems like there could be multiple syntax options that would allow underscore placeholder partial function syntax to create a function which can be called with zero arguments, i.e., a “partially applied” function that’s fully-applied.

This means that the syntax fully generalizes, fixing across any argument position and for all numbers of arguments, and I can feel good throwing my full support behind it. :+1:

Since underscore partial application would make piping much more powerful and useful, after getting it properly into the language the next to-do is to get an operator that’s cleaner-looking and less awkward to type than |>.

Basically yes, that fixes most of the problem with the Base |>

But now we have this idea of first/last generalization of the pipe operator, and it feels kinda compelling. I think Julian programmers have an affinity for that generalization concept, and this />, \> notation seems to me to offer a bit more than the Scala-style underscore placeholder.

Is ‘|> with _’ equivalent to ‘/> \>’ as a feature of the language?

Wow @generated is like having a superpower in your back pocket. Thanks @CameronBieganek for that!

Also: its performance is great. I think I was mistaken previously.

Third-pass code for a typed arbitrary-index partial application functor

struct Fix{F,fixinds,V<:Tuple,KW<:NamedTuple}
    f::F
    fixvals::V
    fixkwargs::KW

    Fix{F,fixinds,V,KW}(f,fixvals,fixkwargs) where {F,fixinds,V,KW} = begin
        orderok(a, b) = a < b || (a > 0 && b < 0) # not a perfect test ... just want args ordered left to right
        length(fixinds) > 1 && @assert all(orderok.(fixinds[begin:end-1], fixinds[begin+1:end]))
        new{F,fixinds,V,KW}(f,fixvals,fixkwargs)
    end
end

Fix{fixinds}(f, fixvals; fixkwargs...) where {fixinds} = 
    Fix{typeof(f), fixinds, typeof(fixvals), typeof((; fixkwargs...))}(f, fixvals, (; fixkwargs...))

@generated (f::Fix{F,fixinds,V,KW})(args...; kwargs...) where {F,fixinds,V,KW} = begin
    combined_args = Vector{Expr}(undef, length(fixinds)+length(args))
    args_i = fixed_args_i = 1
    for i ∈ eachindex(combined_args)
        if any(==(fixinds[fixed_args_i]), (i, i-length(combined_args)-1))
            combined_args[i] = :(f.fixvals[$fixed_args_i])
            fixed_args_i = clamp(fixed_args_i+1, eachindex(fixinds))
        else
            combined_args[i] = :(args[$args_i])
            args_i += 1
        end
    end
    :(f.f($(combined_args...); kwargs..., f.fixkwargs...))
end

Performance:

julia> @btime Fix{(1,2,-3,-1)}((args...;kwargs...)->(args...,(;kwargs...)), (:a,:b,:getting_close,:END); z=5)(:y, 1, 2; k=2)
  1.000 ns (0 allocations: 0 bytes)
(:a, :b, :y, 1, :getting_close, 2, :END, (k = 2, z = 5))

Notes:

  • call Fix{fixinds::Tuple}(f, fixvals::Tuple; fixkwargs...) to construct functor
    fixinds is a tuple of indices starting from left (e.g. (1, 2, 3)), and any indices counting from the right are negative (e.g., (1, 2, 3, -3, -2, -1)). Index 1 is left-most argument, -1 is right-most.
  • Fixed keyword arguments override called keyword arguments. Not sure if this is the right decision.
  • There is no check that the number of arguments or keyword arguments fit a profile; the combined argument list simply grows with number of arguments passed during call, with new arguments filling in the middle between the arguments with positive indices and the arguments with negative indices.
  • This could use some more road testing, for sure
  • FixFirst(f,x) is created by Fix{(1,)}(f, (x,)) which isa Fix{<:Any, (1,)}. It is presumed that such an object would be created by f(x, _...).
  • FixLast(f,x) is created by Fix{(-1,)}(f, (x,)) which isa Fix{<:Any, (-1,)}. It is presumed that such an object would be created by f(_..., x).
  • In many locations where Base.Fix2 is used, people will probably use f(_, x), which will create a Fix{<:Any, (2,)} object. When calling a function with two arguments, the fact that Fix{<:Any, (2,)} behaves as Fix{<:Any,(-1,)} means the type signature of a partial function which does the intended task is not unique. For the people who care about the types of the object, not sure if this matters.
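
The keyword-precedence note above follows from the order of the splats in the generated call. A minimal sketch with plain NamedTuples (the names called and fixed are made up for illustration) shows how the order decides which value wins:

```julia
called = (k = 2, z = 1)   # keyword arguments supplied at call time
fixed  = (z = 5,)         # keyword arguments stored in the functor

# Same order as `kwargs..., f.fixkwargs...` above: later entries win,
# so the fixed value of z overrides the called one.
fixed_wins = (; called..., fixed...)

# Swapping the order would let the caller override the fixed value instead.
called_wins = (; fixed..., called...)
```

Which order is preferable is a design decision; the second form would make fixed keyword arguments act like overridable defaults.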

In terms of what can be done, yes, what can be done with piping + “1 call” underscore partial application is a superset of what can be done with /> and \> (assuming that a _... slurp is incorporated into underscore syntax).

The primary difference is that underscore syntax doesn’t assume which argument you will pipe into; it is manually typed. This is why its functionality is a superset, but it can also be less convenient.

Addressing this, I believe that autocomplete will be useful to discover functions which accept the argument type (most likely as a first argument or last argument), and will automatically enter the underscore into the appropriate argument location.

How will this actually make it easier to autocomplete? Due to multiple dispatch and the first argument not being special on a language level, there is no distinguishing feature to take advantage of here. Not with Base.Fix1, not with /> or \>, and not with _.

To have really good autocomplete in Julia, you need to know the argument types being passed in, which means at least a partial run of type inference to select possible methods. Special syntax for fixing the first (or any argument, really) in place does not help with actually deciding whether a method taking the supposed number of arguments even exists in the first place and thus can’t be the deciding factor for whether or not some autocomplete should/can show the method.

Why? _ as proposed in the PR is literally a placeholder for an argument. It’s not at all related to creating functions that don’t take anything.

Also, there already is syntax for anonymous functions taking in zero arguments:

julia> foo(x) = x+1
foo (generic function with 1 method)

julia> () -> foo(2)
#3 (generic function with 1 method)

julia> ans()
3

But seeing as this is completely unrelated to piping-like workflows (there’s no argument to pass in after all), I don’t see how that should have an impact on the _ PR or this discussion at all.

_... already means something - slurp the splatted arguments and ignore them:

julia> foo(x; _...) = x
foo (generic function with 1 method)

julia> foo(1; bar=1, baz=2)
1

Your proposal about _... feels like needlessly requiring the argument definition of a function to be declared at the callsites in a function, instead of at the definition of the function itself. This complicates the mental load required to parse what a given expression does, since now you have to read the whole function to even figure out how many arguments it takes.

This is not a problem with regular _ because there is no concept of “splatting something I ignore”. So this:

foo = bar(_, b, baz; _...)

Quite literally already means

foo = (x,y) -> bar(x, b, baz; y...)

There is no contextually dependent different semantic of _ here and adding one seems really confusing to me. It would also prevent expressing “I want to splat keyword arguments” with this syntax, which seems a bit odd to me to disallow.

Your computer (most likely) is clocked in the gigahertz range, meaning one instruction every ~nanosecond or faster. Getting a result in that small range is VERY likely to mean that the compiler completely folded any sort of computation away and just inlined the return of that constant. Your benchmark is not representative.
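
One common way to guard against this with BenchmarkTools is to interpolate a Ref and dereference it inside the benchmarked expression, so the values are not visible as compile-time constants. The sketch below uses a plain stand-in function foo rather than the Fix functor and makes no claim about what timings result.

```julia
using BenchmarkTools

foo(a, b, c, d) = (a, b, c, d)

# All inputs are literals, so the compiler may fold the whole call into a
# constant and the reported time may not reflect any real work.
@btime foo(1, 2, 3, 4)

# Interpolating Refs and dereferencing them inside the expression keeps the
# values from being treated as compile-time constants.
@btime foo($(Ref(1))[], $(Ref(2))[], $(Ref(3))[], $(Ref(4))[])
```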

Again, to do that autocomplete needs to know the type of the object and at least has to run type inference, which does not really make sense to do when the function you’re currently writing does not parse correctly, seeing as it’s literally incomplete syntax you wish to complete.


I’m all in favor of improving autocomplete, but please, let’s stay realistic and acknowledge the true problems autocomplete faces, instead of shoe-horning in a feature that does not add anything to solve those core problems. Your proposal has moved from “I want to write/autocomplete code like in an OOP language” to “oh this fancy syntax can do unrelated thing X as well!”, which to me just seems like you’re trying to sell the syntax instead of digging into why autocomplete with the existing semantics is hard (which surface-level syntax changes have no impact on and which are the reason previous proposals to make Julia more OOP-like syntax-wise have failed).

Argument positions may not be dictated by specific features of the language, as they are for class methods in an OO language, but it is still typical to place arguments in certain positions anyway because it’s good practice. A rudimentary autocomplete will assume the chained object should likely take the first argument position, or the last position—this would cover maybe 80% of use cases.

Moreover, and more importantly, it would allow more tightly specialized methods to float to the top of the list. When working with a distribution d = Beta(1, 2), I should be able to type d |> and see that specialized methods such as pdfsquaredL2norm, which specialize on an argument ::Beta, appear. I shared further thoughts here:

In recognition that underscore placeholder syntax is good for chaining, but is actually sugar for partial function application, we should think through how to make it do that job well too. Because if we don’t, we will end up with what was almost a solution to a bunch of other problems, but wasn’t quite good enough because we didn’t think it through. Maybe it’s just the engineer in me, but I have a bias toward trying to solve problems as generally as possible whenever I can. And, because,

This is only true when it’s an lvalue. Same could be said about _ in those situations.

When talking about underscore placeholder syntax, the discussion is about how to treat _ when it’s an rvalue, as an argument to function calls. Placeholder syntax builds out a partial function and uses the position of the _ as a placeholder. I simply propose that _... allow similar treatment, but for varargs.

The opposite, actually. For example, f=my_func(x, _...) would allow me to call f(a, b, c) and it would execute my_func(x, a, b, c). The way I’m proposing it, placeholder _... does not signify arguments you ignore, but vararg arguments you will fill in.

Thus, my_func(x, _..., y; _...) is a function (args...; kwargs...) -> my_func(x, args..., y; kwargs...).

I think your confusion is over my example, which showed that the partial function this_is_spartaaahh = my_func(:aaahh!, _...) could be called with zero arguments as this_is_spartaaahh(). This is not because _... signified ignoring arguments, but instead, simply an artifact of the fact that varargs are allowed to have zero length. You could easily make a vararg partial function which doesn’t permit zero arguments, e.g. f=my_func(x, _, _, _...) which would require at least two arguments.
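
Written out with today’s syntax, the intended meaning of the slurping placeholders would be roughly as follows. This is a sketch of the proposal’s semantics, not of any existing lowering; show_call is a hypothetical stand-in for my_func that simply returns what it received.

```julia
# Hypothetical stand-in for my_func: returns its positional and keyword arguments.
show_call(args...; kwargs...) = (args, (; kwargs...))

x, y = :x, :y

# Proposed: f = show_call(x, _..., y; _...)
f = (args...; kwargs...) -> show_call(x, args..., y; kwargs...)
with_args = f(1, 2; k = 3)
no_args = f()            # allowed, because varargs may be empty

# Proposed: g = show_call(x, _, _, _...)
g = (a, b, rest...) -> show_call(x, a, b, rest...)
two_args = g(1, 2)       # g(1) would throw a MethodError: two arguments are required
```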

Yes, I believe this is the point. I don’t want any runtime computation whatsoever for something that just calls another function.

If the IDE can’t determine the type of the variable that you just typed because it’s part of an incomplete blob, and therefore cannot autocomplete, then how do the OOP guys get functioning autocomplete?

I don’t have any domain-specific knowledge of autocomplete, but I’ll walk you through my thinking.

Take the example above, d = Beta(1, 2), a type from Distributions.jl. The object d has been declared to have type Beta, so when I type d |> the IDE should know that a) this is a Beta object and b) I’m about to pass it to a function (and most likely a partial function); so it should determine the available methods such as kldivergence and invlogccdf to show. After passing it through some transformation functions, if they are type-stable, the IDE should still know the type and be able to determine the available methods.
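
As a very rough sketch of the lookup being described, a tool that already knows the type of d could filter candidate functions by whether they have a method accepting that type. Everything below is hypothetical: MyBeta stands in for Distributions’ Beta, and the candidate list is hand-written, whereas a real tool would need to enumerate functions and infer the type of d itself.

```julia
# Hypothetical stand-in for a distribution type.
struct MyBeta
    a::Float64
    b::Float64
end

# Hypothetical functions, two of which specialize on MyBeta.
beta_mean(d::MyBeta) = d.a / (d.a + d.b)
beta_var(d::MyBeta) = d.a * d.b / ((d.a + d.b)^2 * (d.a + d.b + 1))
unrelated(s::String) = length(s)

d = MyBeta(1, 2)
candidates = (beta_mean, beta_var, unrelated, sqrt)

# Keep only the functions that have a one-argument method accepting typeof(d).
suggestions = [f for f in candidates if hasmethod(f, Tuple{typeof(d)})]
```

In the REPL, methodswith(typeof(d)) from InteractiveUtils performs a broader search of this kind over loaded modules.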

To me, it feels like the missing link to a respectable autocomplete is the ability to tell the IDE, in a way that is core to the language (and therefore worth the time and effort to develop around), what type of object I am going to call a method on before I begin looking for methods.

Am I on the right track, or am I deluded? If I’m deluded, what is it that OOP languages have that Julia doesn’t, that allows their autocomplete to work?