Fixing the Piping/Chaining Issue

It’s true that we get a lot of this with a tight-binding underscore. Just unfortunate that the _ will need to appear in every function along the chain - this is a lot of visual overhead. And the fact that the _ means two separate things in the same parentheses in filter(_>10,_) - I can’t love that, though I imagine I could accept it :slight_smile:
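For concreteness, here is roughly how I read the tight-binding lowering (this is my own sketch of the #24990-style rule, so take the exact desugaring with a grain of salt):

# Each `_` closes over its own innermost call, so the two underscores in
# filter(_>10, _) would refer to different things. Written out explicitly:
double_underscore = xs -> filter(x -> x > 10, xs)   # what `filter(_>10, _)` would mean
double_underscore([5, 15, 25])                      # [15, 25]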

1 Like

A package would specialize methods of existing functions to its types - yes, but:

  • Many functions have general methods applying to Any type
  • And even specialized methods typically specialize for a single argument, not all of them

Again, actual real-life examples showing that this autocompletion is potentially useful (or not, due to too much noise) would be nice!

1 Like

I continue to think \> leans to the left and most obviously fixes the leftmost argument, while /> leans right and fixes the rightmost argument. I see I wasn’t the only one.

5 Likes

Yeah, @Chris_Foster’s JuliaSyntax PR is a bit different from your original proposal. So it is effectively a new proposal. But that’s ok, it’s a drop in the bucket compared to all the proposals in PR #24990. :slight_smile:

2 Likes

It appears to me that the main difference is left-associativity rather than right-associativity?

This seems ok to me, since one of the main motivations for right associativity in the original thread was to allow chains like obj1 /> func2 /> func3() without intermediate function calls; now it has to be written more like obj1 /> func2() /> func3(), which is very much not the end of the world. Also I want to thank again @Chris_Foster for providing a nice proof of concept to play with!

By the way, someone has raised a question on the Slack which seems related here, basically asking if there is a slick way to pipe a constant argument into the back (or presumably front) of each function in a chain of calls.

3 Likes

I feel the same way. I do not like the visual noise.

My thought, however, is that since the two approaches are redundant, both require parsing changes, and both have their own learning curves, it’s likely difficult to advocate for both simultaneously.

So if I have to pick one, I am compelled to choose the one that’s more generally useful and whose behavior is more obvious.

I was more referring to the fact that my OP was being kept alive at all, as I had abandoned it :sweat_smile:

The motivation for right-associativity was not this; it was for being able to fix multiple arguments. For example, arr \> filter(isodd) could also be arr /> isodd /> filter(). This would enable things like myfilt = isodd /> filter and then arr /> myfilt().

But with underscores, it would be arr |> filter(isodd, _) and myfilt = filter(isodd, _), then arr |> myfilt.
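For anyone wanting to try the underscore versions today, a minimal sketch with explicit lambdas (the lowering of filter(isodd, _) to a one-argument closure is my assumption about the proposal):

arr = [1, 2, 3, 4, 5]
myfilt = xs -> filter(isodd, xs)   # stand-in for `filter(isodd, _)`
arr |> myfilt                      # [1, 3, 5]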

1 Like
I had previously shared these thoughts about more sophisticated autocomplete.

However, it looks like even those ideas don’t perfectly cover the use case I currently have in mind!

The Problem:

At the moment, I’m working with a Julia package that somebody else has developed. It’s a homebrew Julia implementation of a decent-sized API that has been professionally developed for other languages (C#, Java, Python). I had considered either using the jcall library, or writing such an API myself, until I found this package; the package has over 3,000 lines so I’d really like to leverage what has already been done.

Even though it’s homebrew, it’s surprisingly well-maintained. However, like anything homebrew, it’s poorly documented. The author trusts that you will be familiar with the existing API. Unfortunately, I am not.

That said, the author appreciates Julia’s functional style, so some things are slightly different from the official API. For example, the member methods provided by the official API’s EClient object are instead globally-accessible methods specializing on a ::Connection object. Some of these methods retain their original camelCase, while others became snake_case.

All this means that, to use it, I must refer to the official API documentation (in a different language), while acknowledging that some things will be different, and while having no method discoverability in my IDE. One must be a diehard [and irrational] Julia enthusiast not to jump ship for Python.

So would an autocomplete work with it?

Using the ideas I proposed above, mostly yes. The methods that had previously been members of EClient in the official API are now specialized to take a ::Connection object as their first argument in the Julia package, so those should be discoverable.

However, quite a few functions have been written without type specialization, presumably in part because the package author didn’t export them and can rely on module encapsulation. These will prove more difficult. And as this is not the sort of library that will have a large presence of users on GitHub, there won’t be good data to draw from for statistical inference.

Is there a way to improve that?

I think there’s a way to solve that problem too, even if imperfectly. When you have an object of type MyType which has been defined in package MyPackage, it’s fairly likely that at least some methods that have been defined in MyPackage will have been written to operate on MyType objects (even if the package author hasn’t given them type annotations).

As a result, when searching for methods, it seems a fairly sensible starting point to look first in the package where the type you are working with was defined. It could be an option in the autocomplete whether to include non-exported methods or not.
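As a rough sketch of that heuristic (MyType is a placeholder for a type defined in some package, and the filtering is deliberately naive):

mod = parentmodule(MyType)          # the module that defined the type
syms = names(mod; all = true)       # `all = true` also returns non-exported names
funcs = filter(syms) do s
    isdefined(mod, s) && getfield(mod, s) isa Function
end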

My IDE is [usually] able to bring me to the line of code where a particular function or type is defined, so this seems doable.

In short, it seems that once you tell your IDE what objects you’re working with, and you’ve communicated to it that you’re about to call a function on them, it should be able to help you find methods, searching through some combination of:

  1. Type specialization
  2. Statistical inference
  3. Package co-location / code proximity

On the surface, there is quite a bit of difference between the original proposal (OP) and the JuliaSyntax PR proposal (JS). Perhaps it works out that the overall behavior is roughly the same—I haven’t figured that out yet.

Let’s take a look at an example from JS. This expression,

x  />  f(y)  \>  g(z)

is parsed as

# S-expression pseudo-code:
(chain x (/> f y) (\> g z))

which gets lowered to

chain(x, fixbutfirst(f, y), fixbutlast(g, z))

So, in JS, /> and \> are effectively unary operators that take in a function call on the right-hand side (RHS). They do not operate on the code on their left-hand side (LHS). This is in contrast to OP, where the front/back fix operators are binary operators that operate on the LHS and the RHS.

Another difference to note is that JS has an implicit piping operation built in (piping is not currying!), which is expressed by the chain function in the parsed and lowered code. OP on the other hand, does not exactly have a piping semantic, although it kind of sneaks in by the way that function calls and front/back fix operators are parsed.
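To make the comparison concrete, here is a minimal closure-based sketch of those three pieces (the actual PR defines callable structs, so this is only a behavioral approximation):

fixbutfirst(f, args...; kwargs...) = x -> f(x, args...; kwargs...)   # leaves the first slot free
fixbutlast(f, args...; kwargs...)  = x -> f(args..., x; kwargs...)   # leaves the last slot free
chain(x, fs...) = foldl((acc, f) -> f(acc), fs; init = x)            # the implicit piping

f(a, b) = (:f, a, b); g(a, b) = (:g, a, b)
chain(1, fixbutfirst(f, 2), fixbutlast(g, 3))   # == g(3, f(1, 2)) == (:g, 3, (:f, 1, 2))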

The above expression is parsed rather differently in OP. OP would produce the following after parsing:

(((x  />  f)(y))  \>  g)(z)

Let’s re-write that using S-expressions, for easier comparison with JS:

((((/> x f) y) \> g) z)

That’s quite a bit different from what JS is doing. Although we should be careful to note that /> is different in OP and JS (more on that later).

Another important difference is that when you are creating a Fix* type, OP produces nested types, whereas JS produces just one type. Consider this function:

foo(x, y, z) = x, y, z

If we want to fix the first two arguments, we would do this in OP:

:b /> :a /> foo

which returns this:

FixFirst(FixFirst(foo, :a), :b)

If we want to fix the first two arguments in JS, we would do this:

\> foo(:a, :b)

which calls this:

fixbutlast(foo, :a, :b)

which produces a single (non-nested) FixButLast object.

Actually, this example shows that, in JS, \> and /> are treated basically opposite to the way they are treated in OP. In JS, (/> f x) means “fix every argument except the first”, whereas in OP, (/> x f) means “fix only the first argument, and leave all the others free”. So the JS implementation actually behaves a lot more like “front pipe” and “back pipe” than like “front fix” and “back fix”. I think it would be semantically cleaner to just have “front pipe” and “back pipe” than to have “fix every argument but the first” and “fix every argument but the last”, with an implicit chain thrown in.
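A hypothetical side-by-side of the two functor shapes being compared here (the struct names follow the discussion, not any actual implementation):

struct FixFirst{F,X}    # OP: fixes only the first argument
    f::F
    x::X
end
(p::FixFirst)(args...) = p.f(p.x, args...)

struct FixButLast{F,A}  # JS: fixes everything except the last argument
    f::F
    args::A
end
(p::FixButLast)(x) = p.f(p.args..., x)

foo(x, y, z) = (x, y, z)
FixFirst(FixFirst(foo, :a), :b)(:c)   # OP-style nested wrappers -> (:a, :b, :c)
FixButLast(foo, (:a, :b))(:c)         # JS-style single wrapper  -> (:a, :b, :c)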

3 Likes

That’s great and all, but I find it quite hard to guess the usefulness of potential autocomplete without actual examples (“for objects of this type in the last position, autocompleted functions would be: …”) or even better, a basic implementation. Something simple like propose_autocompletions(func, obj, narg)::Vector{Method} would help everyone to judge how useful autocomplete could potentially be.

Thanks for the exposition, this makes sense. And to be honest, the JS version seems a little simpler to me for just about the same effect.

I just want to be clear though, am I reading it correctly that

chain(x, fixbutfirst(f, y), fixbutlast(g, z))

and

((((/> x f) y) \> g) z)

are both going to evaluate to this?

g(z, f(x,y))

That is, when evaluated “all the way” to a concrete function call, the OP and JS proposals will return the same value for any sequence of \> and /> pipes, and the main differences are in which intermediate functors get constructed?

I think that’s correct. At least I haven’t found a counter-example yet. (Aside from the different intermediate functors that you mentioned, which can of course be concretely realized if you explicitly want to save a partially applied function.)

1 Like

Okay I guess I misunderstood what you meant by “actual real-life example.”

The best examples will not be of methods or types from Base, but from packages that create complicated objects with many tightly-specialized methods, for which method discovery will be most appreciated.

I’ll use DataFrames.jl as an example. Keep in mind, this is just a simple example of how autocomplete could behave, solely by acting on types.

Example Possible Autocomplete Behavior

Here's a walked-through example of how an autocomplete *could* work with underscore syntax.

Let’s create an object df = DataFrame(a=1, b=2). When I type

df |> 

I should see a (very very long) list of methods appear: those which specialize on typeof(df), followed by methods which specialize on supertype(typeof(df)), followed by supertype(supertype(typeof(df))), etc., sorted by degree of specialization. The list is about two thousand entries long, something like this:

df |>
  append!(df1::DataFrame, df2::AbstractDataFrame; cols, promote)
  append!(df::DataFrame, table; cols, promote)
  copy(df::DataFrame; copycols)

  ⋮ other methods of `DataFrame`
  ⋮ (vdots here to shorten my explanation)

  Array(df::AbstractDataFrame)
  ==(df1::AbstractDataFrame, df2::AbstractDataFrame)
  (Matrix)(df::AbstractDataFrame)

  ⋮ other methods of `AbstractDataFrame`

  ArgumentError(msg)
  AssertionError(msg)
  BoundsError(a)

  ⋮ other methods of `Any`

The fact that we have underscore syntax in the language means I can call any of these methods conveniently using the pipe operator. The list was simply created by calling methodswith on the type and its supertypes, with no attention paid to argument position.
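For reference, most of that raw list can be produced with a single InteractiveUtils call (note that methodswith excludes methods annotated only with Any, so the tail of the list above would need a separate source):

using DataFrames, InteractiveUtils
df = DataFrame(a = 1, b = 2)
ms = methodswith(typeof(df); supertypes = true)   # methods of DataFrame, AbstractDataFrame, ...
length(ms)                                        # the pool of candidates before any filtering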

Pressing CTRL+B (or something, some hotkey combination) might change settings. For example, maybe I want to see only methods that act on abstract types, in which case pressing CTRL+B could bring up:

df |>
  Array(df::AbstractDataFrame)
  ==(df1::AbstractDataFrame, df2::AbstractDataFrame)
  (Matrix)(df::AbstractDataFrame)

  ⋮ other methods of `AbstractDataFrame`

  ArgumentError(msg)
  AssertionError(msg)
  BoundsError(a)

  ⋮ other methods of `Any`

But for now, I decide I want to see methods specialized to strictly this concrete type. So I press CTRL+B again and I see:

df |>
  append!(df1::DataFrame, df2::AbstractDataFrame; cols, promote)
  append!(df::DataFrame, table; cols, promote)
  copy(df::DataFrame; copycols)
  delete!(df::DataFrame, inds)
  deleteat!(df::DataFrame, inds::InvertedIndex)
  deleteat!(df::DataFrame, inds::AbstractVector{Bool})

  ⋮

And then I can scroll down the list to find what I’m looking for. One neuron fires in my brain and I remember that the first character is a p. So I type p and I see:

df |> p
  pop!(df::DataFrame)
  popat!(df::DataFrame, i::Integer)
  popfirst!(df::DataFrame)
  prepend!(df1::DataFrame, df2::AbstractDataFrame; cols, promote)
  prepend!(df::DataFrame, table; cols, promote)
  push!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  push!(df::DataFrame, row::DataFrameRow; cols, promote)
  push!(df::DataFrame, row; promote)
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

The list is now sufficiently short that I can see the whole thing and remind myself that the function I wanted to call was pushfirst!, and I type u. Now I see:

df |> pu
  push!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  push!(df::DataFrame, row::DataFrameRow; cols, promote)
  push!(df::DataFrame, row; promote)
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

I hit <tab> and it autocompletes to push:

df |> push
  push!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  push!(df::DataFrame, row::DataFrameRow; cols, promote)
  push!(df::DataFrame, row; promote)
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

Now I type f:

df |> pushf
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

I press <tab> again and the name fully fills out, including the unfixed argument and placing the cursor after the comma. Now I see:

df |> pushfirst!(_, )
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

and the list all but disappears as I fill out the rest of the arguments.

df |> pushfirst!(_, [1, 2])
  pushfirst!(df::DataFrame, row; promote)

The autocomplete has assisted me in finding the method I was looking for, enabling me to search for methods which specialize on its concrete type.

Where m is one of the methods returned by calling methodswith, this simple example uses only m.sig and doesn’t sort by argument position at all. However, it could be imagined to do so.

In addition, it could be imagined to use m.module to sort methods by what module defined them, showing first the methods defined in the same module as this object (using parentmodule(MyType)); m.file to find only the methods which were defined in the same file as this object; or any of the other properties of a Method to return better search results. (and the autocomplete could have a suite of hotkeys, or a settings panel, or some settings popup dialog, to determine how it searches.) It could also use statistical inference based on function call data from GitHub, or even your personal use data, to return better search results.
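A small sketch of that kind of metadata-based ranking (the fields m.module and m.name really exist on Method; the ranking rule itself is just an illustration):

using DataFrames, InteractiveUtils
df = DataFrame(a = 1, b = 2)
ms = methodswith(typeof(df); supertypes = true)
# float methods defined in the type's own package to the top, then sort by name
sort!(ms; by = m -> (m.module !== parentmodule(typeof(df)), string(m.name)))
first(ms, 5)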

I hope I don’t have to write my own autocomplete, and I’m totally incompetent, but if I’m pushed hard enough…

1 Like

Thanks for the detailed comparison!

I’ll admit that I looked at the OP and thought “oh more chaining/anon function stuff” and jumped to conclusions, apparently without reading the OP properly :sweat_smile:

So the significant underlying differences here weren’t exactly intentional! But might be an interesting alternative take.

2 Likes

Just some thoughts, it’s been a long time since I used Common Lisp but if I remember correctly you can actually replace the parser at run time with something else. There’s something called the “readtable” and you can make “read” do basically anything… This can be super useful for constructing entire honest to goodness domain specific languages which are not a subset of the basic syntax of common lisp. Similarly I could imagine something like:

@with_reader foo ...

and in the ‘…’ it would actually call the reading function foo to read and parse the stuff in …

I’m imagining that JuliaSyntax might actually enable that sort of thing to happen? Whether that’s a good idea or not is up in the air :sweat_smile:

Yes, in fact I might be inclined to say it’s better! It seems to sacrifice some of the functor-currying/partial-application power of the original proposal, but in return it gets to be a lot simpler and more obvious, and probably easier on the compiler, while still satisfying the majority of use cases. I think we can still do things like map(/> myfunc(myparam), myvec).

1 Like

Counterproposal, rewriting yours:

Example Possible Autocomplete Behavior

Let’s create an object df = DataFrame(a=1, b=2). When I type

df |> 

I should see a (very very long) list of methods appear: those which specialize on typeof(df), followed by methods which specialize on supertype(typeof(df)), followed by supertype(supertype(typeof(df))), etc., sorted by degree of specialization. The list is about two thousand entries long, something like this:

df |>
  append!(DataFrame,
  copy(DataFrame;

  ⋮ other methods of `DataFrame`
  ⋮ (vdots here to shorten my explanation)

  Array(AbstractDataFrame)
  ==(AbstractDataFrame,
  (Matrix)(AbstractDataFrame

  ⋮ other methods of `AbstractDataFrame`

  ArgumentError(msg)
  AssertionError(msg)
  BoundsError(a)

  ⋮ other methods of `Any`

And then I can scroll down the list to find what I’m looking for. One neuron fires in my brain and I remember that the first character is a p. So I type p and I see:

df |> p
  pop!(DataFrame)
  popat!(DataFrame,
  popfirst!(DataFrame)
  prepend!(DataFrame,
  push!(DataFrame,
  pushfirst!(DataFrame,

I think maybe at the first level we should show two levels at once, typeof(df) together with supertype(typeof(df)), disregarding the distinction between them (unless the supertype is Any?), and order them alphabetically as one group?

2 Likes

Okay, so since working through that example I basically did all the work needed to figure out how to do it anyway, I threw together a quick script that does what I described for a simple autocomplete (you knew I would, didn’t you :triumph:)

Code here

using InteractiveUtils  # provides methodswith (loaded automatically in the REPL)

function propose_method_autocompletions(obj, func_name_fragment::String="";
    type_depth = 1:typemax(Int), only_same_module = false,
    prioritize_firstlast = false, github_inference = false, personalized_inference = false)::Vector{Method}

    @assert first(type_depth) ≥ 1 && last(type_depth) ≥ 1
    recs = Method[]
    # walk up the supertype chain: depth 1 is the concrete type, then its supertypes
    get_type_at_depth(type, depth=1) = depth == 1 ? type : type == Any ? nothing : get_type_at_depth(supertype(type), depth-1)

    for i ∈ type_depth
        stype = get_type_at_depth(typeof(obj), i)
        isnothing(stype) && break

        # keep only methods whose name starts with the typed fragment
        meths = filter(only_same_module ? methodswith(stype, parentmodule(typeof(obj))) : methodswith(stype)) do m
            startswith(string(m.name), func_name_fragment)
        end

        prioritize_firstlast || true # do cool sorting stuff, add later
        github_inference || true # do cool sorting stuff, add later
        personalized_inference || true # do cool sorting stuff, add later

        recs = [recs; meths]
    end

    recs
end

To invoke:

using DataFrames
df = DataFrame(a=1, b=2)

propose_method_autocompletions(df)

By default, it returns all available methods that can act on object df. As you start to type in a function name, the list gets narrowed down quickly:

propose_method_autocompletions(df, "p")
propose_method_autocompletions(df, "pu")
propose_method_autocompletions(df, "pushfirst!")

You can also change the search depth. By default, it searches specializations on typeof(df), as well as supertype(typeof(df)), supertype(supertype(typeof(df))), and so on. This can be changed like so:

propose_method_autocompletions(df, "p"; type_depth=1:2)

This gives methods of DataFrame as well as AbstractDataFrame, but not Any. The iterator can also be reversed 2:-1:1 to show the abstract type’s methods first.

You can also restrict the search to only methods that are defined in the same module as DataFrame:

propose_method_autocompletions(df, "p"; only_same_module = true)

And, one day, it’ll work with the prioritize_firstlast, github_inference and personalized_inference options too. :wink:

Someday it could be interesting to make it feed its results into a Channel, so that it can offer results with greater immediacy.
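A rough sketch of that Channel idea, layered on the function above (the per-depth batching, buffer size, and depth cap are all arbitrary choices):

function propose_method_autocompletions_streaming(obj, frag::String = ""; max_depth = 10)
    Channel{Method}(32) do ch
        for depth in 1:max_depth
            # reuse the function above one supertype level at a time,
            # so the consumer sees the most specific matches first
            for m in propose_method_autocompletions(obj, frag; type_depth = depth:depth)
                put!(ch, m)
            end
        end
    end
end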

Unfortunately it doesn’t solve *my* problem, because the module didn’t export its methods so they don’t appear in methodswith :sweat_smile:

Oddly, when calling methodswith(DataFrame, DataFrames) (trying this to see if it improves speed over filtering *after* the methodswith call), it simply doesn’t return a bunch of methods of DataFrame that are indeed declared by the DataFrames module. For example, pushfirst! is missing. Strange.

3 Likes

Yes I see. It’s very funny that we’ve done the opposite things here in terms of precedence, but with the end result being almost the same from the user’s perspective :laughing:

I think /> having high precedence would be surprising to anyone who’s trying to use it for piping. For example, what does this mean?

x /> Mod1.g[1].h(y)(z)
  • Low precedence: Mod1.g[1].h(y)(x, z)
  • High precedence: Mod1.g[1].h(x,y)(z) … presumably?

Couple of thoughts here:

  • Is there a flexible lowering which might allow pipelines of /> to be optimized more completely. For example, turning x \> map(_^2) \> reduce(+) into mapreduce(_^2, +, x). In particular I’m thinking of how transducers are a lot better for compilation than iterators, as described here Comparison to iterators · Transducers.jl and that we could make use of the iterator-transducer duality if we could give the compiler the right insight. @tkf :slight_smile:
  • What about first class pipelines without data such as \> map(isodd) \> filter(_>10)? Well ok I guess these can be lowered to a lambda x -> filter(y->y>10, map(isodd, x)) (see the sketch below).
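Here is a minimal sketch of that second point, realizing a data-free pipeline as an ordinary closure (pipeline and fixbutlast are illustrative stand-ins, not the PR’s actual lowering):

fixbutlast(f, args...; kwargs...) = x -> f(args..., x; kwargs...)
pipeline(stages...) = x -> foldl((acc, f) -> f(acc), stages; init = x)

square_then_big = pipeline(fixbutlast(map, x -> x^2), fixbutlast(filter, >(10)))
square_then_big(1:5)   # [16, 25]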
2 Likes

To answer my own question, yes! The fixbutfirst/fixbutlast lowering already allows this. Adding the following definition to chain will turn a map \> reduce pipeline into a call to mapreduce:

function chain(x, f1::FixButLast{typeof(map)}, f2::FixButLast{typeof(reduce)}, fs...)
    chain(x, fixbutlast(mapreduce, f1.args..., f2.args...; f1.kwargs..., f2.kwargs...), fs...)
end

Presumably this is not the ideal way to hook in such transformations, since it handles only a fairly special case. But it does give a tantalizing taste of possibility.

2 Likes

Great, thanks for the implementation!

A nice thing is that such autocomplete can be added without any changes to Julia syntax: just expand it to obj |> it->func(<cursor here>, it) instead of obj |> func(<cursor here>, _). So, it can already be implemented in a package using e.g. ReplMaker.
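Concretely, the purely textual expansion meant here would look something like this (reusing the pushfirst! example from the walkthrough above):

using DataFrames
df = DataFrame(a = 1, b = 2)

# what `df |> pushfirst!(_, [1, 2])` would expand to, with the cursor left inside the closure:
df |> (it -> pushfirst!(it, [1, 2]))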

I tried your propose_method_autocompletions, and some kind of sorting would definitely help.
For a simple table, tbl = [(a=1,)], the first suggestions without typing the method name are:

julia> propose_method_autocompletions(tbl)
[1] ArgumentError(msg) in Core at boot.jl:325
[2] AssertionError(msg) in Core at boot.jl:341
[3] BoundsError(a) in Core at boot.jl:277
[4] BoundsError(a, i) in Core at boot.jl:278
[5] ConcurrencyViolationError(msg) in Core at boot.jl:290
[6] DomainError(val) in Core at boot.jl:296
[7] DomainError(val, msg) in Core at boot.jl:297
[8] ErrorException(msg) in Core at boot.jl:267
[9] InexactError(f::Symbol, T, val) in Core at boot.jl:318
[10] InitError(mod::Symbol, error) in Core at boot.jl:354
[11] InitError(mod, error) in Core at boot.jl:354
[12] LineNumberNode(l::Int64, f) in Core at boot.jl:407
[13] LoadError(file::AbstractString, line::Int64, error) in Core at boot.jl:348
[14] LoadError(file, line, error) in Core at boot.jl:348
[15] MethodError(f, args) in Core at boot.jl:338
[16] MethodError(f, args, world::UInt64) in Core at boot.jl:335
[17] (::Type{T})(itr) where T<:Tuple in Base at tuple.jl:317
[18] NamedTuple(itr) in Base at namedtuple.jl:123
[19] OverflowError(msg) in Core at boot.jl:321
[20] Pair(a, b) in Core at boot.jl:825
...

None of these methods are commonly used on tables/collections.
Typing propose_method_autocompletions(tbl, "push") does find the push! function of course, but still doesn’t show the expected push!(tbl, ...) method among the first suggestions. The first is push!(s::Set, x), so it wants to put tbl as the 2nd argument.

Autocomplete based on the ideas from this topic can already be implemented in a package, and it would be great if it turns out to be actually useful! But for now I’m skeptical, given the huge number of methods that specify few or no argument types.

3 Likes