Fixing the Piping/Chaining Issue

I had previously shared these thoughts about more sophisticated autocomplete.

However, it looks like even those ideas don’t perfectly cover the use case I currently have in mind!

The Problem:

At the moment, I’m working with a Julia package that somebody else has developed. It’s a homebrew Julia implementation of a decent-sized API that has been professionally developed for other languages (C#, Java, Python). I had considered either using the jcall library, or writing such an API myself, until I found this package; the package has over 3,000 lines so I’d really like to leverage what has already been done.

Even though it’s homebrew, it’s surprisingly well-maintained. However, like anything homebrew, it’s poorly documented. The author trusts that you will be familiar with the existing API. Unfortunately, I am not.

That said, the author appreciates Julia’s functional style, so some things are slightly different from the official API. For example, the member methods provided by the official API’s EClient object are instead globally-accessible methods specializing on a ::Connection object. Some of these methods retain their original camelCase, while some became snake_case.

All this means that, to use it, I must refer to the official API documentation (in a different language), while acknowledging that some things will be different, and while having no method discoverability in my IDE. One must be a diehard [and irrational] Julia enthusiast not to jump ship for Python.

So would an autocomplete work with it?

Using the ideas I proposed above, mostly yes. The methods that had previously been members of EClient in the official API are now specialized to take a ::Connection object as their first argument in the Julia package, so those should be discoverable.

However, quite a few functions have been written without type specialization, presumably in part because the package author didn’t export them and can rely on module encapsulation. These will prove more difficult. And as this is not the sort of library that will have a large presence of users on GitHub, there won’t be good data to draw from for statistical inference.

Is there a way to improve that?

I think there’s a way to solve that problem too, even if imperfectly. When you have an object of type MyType which has been defined in package MyPackage, it’s fairly likely that at least some methods that have been defined in MyPackage will have been written to operate on MyType objects (even if the package author hasn’t given them type annotations).

As a result, when searching for methods, it seems a fairly sensible starting point to look first in the package where the type you are working with was defined. It could be an option in the autocomplete whether to include non-exported methods or not.

My IDE is [usually] able to bring me to the line of code where a particular function or type is defined, so this seems doable.
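As a rough sketch of this co-location idea (the function names here are hypothetical, invented for illustration), one could look up the module that defined a type and enumerate its names, including non-exported ones:

```julia
# Hypothetical sketch: look first in the module that defined the type.
# `all = true` includes non-exported names, which could be an autocomplete option.
owner_module(T::Type) = parentmodule(T)
candidate_names(T::Type; include_unexported::Bool = true) =
    names(owner_module(T); all = include_unexported)

candidate_names(Int)  # names defined in Core, the home module of Int
```

Filtering those names down to ones bound to functions with applicable methods would be the next step.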

In short, it seems that once you tell your IDE what objects you’re working with, and you’ve communicated to it that you’re about to call a function on them, it should be able to help you find methods, searching through some combination of:

  1. Type specialization
  2. Statistical inference
  3. Package co-location / code proximity

On the surface, there is quite a bit of difference between the original proposal (OP) and the JuliaSyntax PR proposal (JS). Perhaps it works out that the overall behavior is roughly the same—I haven’t figured that out yet.

Let’s take a look at an example from JS. This expression,

x  />  f(y)  \>  g(z)

is parsed as

# S-expression pseudo-code:
(chain x (/> f y) (\> g z))

which gets lowered to

chain(x, fixbutfirst(f, y), fixbutlast(g, z))

So, in JS, /> and \> are effectively unary operators that take in a function call on the right-hand side (RHS). They do not operate on the code on their left-hand side (LHS). This is in contrast to OP, where the front/back fix operators are binary operators that operate on the LHS and the RHS.

Another difference to note is that JS has an implicit piping operation built in (piping is not currying!), which is expressed by the chain function in the parsed and lowered code. OP on the other hand, does not exactly have a piping semantic, although it kind of sneaks in by the way that function calls and front/back fix operators are parsed.
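For concreteness, here is a minimal closure-based model of the JS semantics just described. This is an illustration under my own assumptions, not the PR’s actual implementation; fixbutfirst, fixbutlast, and chain are stand-ins written as plain functions:

```julia
# Plain-closure model of the JS lowering (illustrative only)
fixbutfirst(f, args...; kwargs...) = x -> f(x, args...; kwargs...)  # free slot first
fixbutlast(f, args...; kwargs...)  = x -> f(args..., x; kwargs...)  # free slot last
chain(x, fs...) = foldl(|>, fs; init = x)                           # implicit piping

f(a, b) = a - b
g(a, b) = (a, b)
chain(10, fixbutfirst(f, 3), fixbutlast(g, 1))  # g(1, f(10, 3)) == (1, 7)
```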

The above expression is parsed rather differently in OP. OP would produce the following after parsing:

(((x  />  f)(y))  \>  g)(z)

Let’s re-write that using S-expressions, for easier comparison with JS:

((((/> x f) y) \> g) z)

That’s quite a bit different from what JS is doing. Although we should be careful to note that /> is different in OP and JS (more on that later).

Another important difference is that when you are creating a Fix* type, OP produces nested types, whereas JS produces just one type. Consider this function:

foo(x, y, z) = x, y, z

If we want to fix the first two arguments, we would do this in OP:

:b /> :a /> foo

which returns this:

FixFirst(FixFirst(foo, :a), :b)

If we want to fix the first two arguments in JS, we would do this:

\> foo(:a, :b)

which calls this:

fixbutlast(foo, :a, :b)

which produces a single (non-nested) FixButLast object.

Actually, this example shows that, in JS, \> is treated basically opposite to the way it is treated in OP. In JS, (/> f x) means “fix every argument except the first”, whereas in OP, (/> x f) means “fix only the first argument, and leave all the others free”. So the JS implementation actually behaves a lot more like “front pipe” and “back pipe” than like “front fix” and “back fix”. I think it would be semantically cleaner to just have “front pipe” and “back pipe” than to have “fix every argument but the first” and “fix every argument but the last”, with an implicit chain thrown in.
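To make the OP-style nesting tangible, here is a minimal struct-based stand-in for the front-fix functor (FixFirst here is my own toy definition, not the proposal’s actual type):

```julia
# Illustrative stand-in for OP's front-fix functor
struct FixFirst{F,X}
    f::F
    x::X
end
(ff::FixFirst)(args...; kwargs...) = ff.f(ff.x, args...; kwargs...)

foo(x, y, z) = (x, y, z)
nested = FixFirst(FixFirst(foo, :a), :b)  # what `:b /> :a /> foo` would build
nested(:c)  # (:a, :b, :c)
```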

3 Likes

That’s great and all, but I find it quite hard to guess the usefulness of potential autocomplete without actual examples (“for objects of this type in the last position, autocompleted functions would be: …”) or even better, a basic implementation. Something simple like propose_autocompletions(func, obj, narg)::Vector{Method} would help everyone to judge how useful autocomplete could potentially be.

Thanks for the exposition, this makes sense. And to be honest, the JS version seems a little simpler to me for just about the same effect.

I just want to be clear though, am I reading it correctly that

chain(x, fixbutfirst(f, y), fixbutlast(g, z))

and

((((/> x f) y) \> g) z)

are both going to evaluate to this?

g(z, f(x,y))

That is, when evaluated “all the way” to a concrete function call, the OP and JS proposals will return the same value for any sequence of \> and /> pipes, and the main differences are in which intermediate functors get constructed?

I think that’s correct. At least I haven’t found a counter-example yet. (Aside from the different intermediate functors that you mentioned, which can of course be concretely realized if you explicitly want to save a partially applied function.)

1 Like

Okay I guess I misunderstood what you meant by “actual real-life example.”

The best examples will not be of methods or types from Base, but from packages that create complicated objects with many tightly-specialized methods, for which method discovery will be most appreciated.

I’ll use DataFrames.jl as an example. Keep in mind, this is just a simple example of how autocomplete could behave, solely by acting on types.

Example Possible Autocomplete Behavior

Here's a walked-through example of how an autocomplete *could* work with underscore syntax. (click to expand)

Let’s create an object df = DataFrame(a=1, b=2). When I type

df |> 

I should see a (very very long) list of methods appear: those which specialize on typeof(df), followed by methods which specialize on supertype(typeof(df)), followed by supertype(supertype(typeof(df))), etc., sorted by degree of specialization. The list is about two thousand entries long, something like this:

df |>
  append!(df1::DataFrame, df2::AbstractDataFrame; cols, promote)
  append!(df::DataFrame, table; cols, promote)
  copy(df::DataFrame; copycols)

  ⋮ other methods of `DataFrame`
  ⋮ (vdots here to shorten my explanation)

  Array(df::AbstractDataFrame)
  ==(df1::AbstractDataFrame, df2::AbstractDataFrame)
  (Matrix)(df::AbstractDataFrame)

  ⋮ other methods of `AbstractDataFrame`

  ArgumentError(msg)
  AssertionError(msg)
  BoundsError(a)

  ⋮ other methods of `Any`

The fact that we have underscore syntax in the language means I can call any of these methods conveniently using the pipe operator. The list was simply created by calling methodswith on the type and its supertypes, with no attention paid to argument position.
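A sketch of the ordering logic behind that listing, assuming we simply walk the supertype chain and query methodswith at each level (skipping the enormous Any level):

```julia
using InteractiveUtils  # provides methodswith

# The supertype chain that orders the listing above, most specific first
function supertype_chain(T::Type)
    chain = Type[T]
    while last(chain) !== Any
        push!(chain, supertype(last(chain)))
    end
    chain
end

# Collect methods level by level, excluding Any
level_methods(T) = [methodswith(S) for S in supertype_chain(T) if S !== Any]

supertype_chain(Vector{Int})  # [Vector{Int}, DenseVector{Int}, AbstractVector{Int}, Any]
```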

Pressing CTRL+B (or something, some hotkey combination) might change settings. For example, maybe I want to see only methods that act on abstract types, in which case pressing CTRL+B could bring up:

df |>
  Array(df::AbstractDataFrame)
  ==(df1::AbstractDataFrame, df2::AbstractDataFrame)
  (Matrix)(df::AbstractDataFrame)

  ⋮ other methods of `AbstractDataFrame`

  ArgumentError(msg)
  AssertionError(msg)
  BoundsError(a)

  ⋮ other methods of `Any`

But for now, I decide I want to see methods specialized to strictly this concrete type. So I press CTRL+B again and I see:

df |>
  append!(df1::DataFrame, df2::AbstractDataFrame; cols, promote)
  append!(df::DataFrame, table; cols, promote)
  copy(df::DataFrame; copycols)
  delete!(df::DataFrame, inds)
  deleteat!(df::DataFrame, inds::InvertedIndex)
  deleteat!(df::DataFrame, inds::AbstractVector{Bool})

  ⋮

And then I can scroll down the list to find what I’m looking for. One neuron fires in my brain and I remember that the first character is a p. So I type p and I see:

df |> p
  pop!(df::DataFrame)
  popat!(df::DataFrame, i::Integer)
  popfirst!(df::DataFrame)
  prepend!(df1::DataFrame, df2::AbstractDataFrame; cols, promote)
  prepend!(df::DataFrame, table; cols, promote)
  push!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  push!(df::DataFrame, row::DataFrameRow; cols, promote)
  push!(df::DataFrame, row; promote)
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

The list is now sufficiently short that I can see the whole thing and remind myself that the function I wanted to call was pushfirst!, and I type u. Now I see:

df |> pu
  push!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  push!(df::DataFrame, row::DataFrameRow; cols, promote)
  push!(df::DataFrame, row; promote)
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

I hit <tab> and it autocompletes to push:

df |> push
  push!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  push!(df::DataFrame, row::DataFrameRow; cols, promote)
  push!(df::DataFrame, row; promote)
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

Now I type f:

df |> pushf
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

I press <tab> again and the name fully fills out, including the unfixed argument and placing the cursor after the comma. Now I see:

df |> pushfirst!(_, )
  pushfirst!(df::DataFrame, row::Union{AbstractDict, NamedTuple}; cols, promote)
  pushfirst!(df::DataFrame, row::DataFrameRow; cols, promote)
  pushfirst!(df::DataFrame, row; promote)

and the list all but disappears as I fill out the rest of the arguments.

df |> pushfirst!(_, [1, 2])
  pushfirst!(df::DataFrame, row; promote)

The autocomplete has assisted me in finding the method I was looking for, enabling me to search for methods which specialize on its concrete type.

Where m is one of the methods returned by calling methodswith, this simple example uses only m.sig and doesn’t sort by argument position at all. However, it could be imagined to do so.
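One way such position-aware filtering could look, using a rough type-intersection test on m.sig. This is a sketch (accepts_first is a hypothetical helper; real tooling would also need to handle keyword arguments and free type variables):

```julia
# Rough check: could `obj` be passed as the FIRST positional argument of method m?
# m.sig is Tuple{typeof(f), arg1, arg2, ...}, so we intersect it with a pattern
# that pins the first positional slot to typeof(obj).
accepts_first(m::Method, obj) =
    Base.typeintersect(m.sig, Tuple{Any, typeof(obj), Vararg{Any}}) !== Union{}

h(x::Int, y) = x + y  # toy method for demonstration
m = first(methods(h))
accepts_first(m, 1)    # true
accepts_first(m, "s")  # false
```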

In addition, it could be imagined to use m.module to sort methods by what module defined them, showing first the methods defined in the same module as this object (using parentmodule(MyType)); m.file to find only the methods which were defined in the same file as this object; or any of the other properties of a Method to return better search results. (and the autocomplete could have a suite of hotkeys, or a settings panel, or some settings popup dialog, to determine how it searches.) It could also use statistical inference based on function call data from GitHub, or even your personal use data, to return better search results.
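The m.module idea could be sketched as a simple two-key sort (sort_by_home_module is a hypothetical name, and alphabetical-within-group is just one possible tiebreak):

```julia
# Hypothetical ranking: methods defined in the type's home module come first,
# alphabetical within each group. Sorting on a (Bool, String) key works because
# `false` sorts before `true`.
function sort_by_home_module(ms, T::Type)
    home = parentmodule(T)
    sort(ms; by = m -> (m.module !== home, string(m.name)))
end
```

The same pattern extends to m.file for same-file proximity, or to a frequency score for the statistical modes.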

I hope I don’t have to write my own autocomplete, since I’m totally incompetent, but if I’m pushed hard enough…

1 Like

Thanks for the detailed comparison!

I’ll admit that I looked at the OP and thought “oh more chaining/anon function stuff” and jumped to conclusions, apparently without reading the OP properly :sweat_smile:

So the significant underlying differences here weren’t exactly intentional! But might be an interesting alternative take.

2 Likes

Just some thoughts, it’s been a long time since I used Common Lisp but if I remember correctly you can actually replace the parser at run time with something else. There’s something called the “readtable” and you can make “read” do basically anything… This can be super useful for constructing entire honest to goodness domain specific languages which are not a subset of the basic syntax of common lisp. Similarly I could imagine something like:

@with_reader foo ...

and in the ‘…’ it would actually call the reading function foo to read and parse the stuff in …

i’m imagining that JuliaSyntax might actually enable that sort of thing to happen? Whether that’s a good idea or not is up in the air :sweat_smile:

Yes, in fact I might be inclined to say it’s better! It seems to sacrifice some of the most powerful functor-currying-partial-application power of the original proposal, but in return it gets to be a lot simpler and more obvious, and probably easier on the compiler, while still satisfying the majority of use cases. I think we can still do things like map(/> myfunc(myparam), myvec)

1 Like

Counterproposal, rewriting yours:

Example Possible Autocomplete Behavior

Let’s create an object df = DataFrame(a=1, b=2). When I type

df |> 

I should see a (very very long) list of methods appear: those which specialize on typeof(df), followed by methods which specialize on supertype(typeof(df)), followed by supertype(supertype(typeof(df))), etc., sorted by degree of specialization. The list is about two thousand entries long, something like this:

df |>
  append!(DataFrame,
  copy(DataFrame;

  ⋮ other methods of `DataFrame`
  ⋮ (vdots here to shorten my explanation)

  Array(AbstractDataFrame)
  ==(AbstractDataFrame,
  (Matrix)(AbstractDataFrame

  ⋮ other methods of `AbstractDataFrame`

  ArgumentError(msg)
  AssertionError(msg)
  BoundsError(a)

  ⋮ other methods of `Any`

And then I can scroll down the list to find what I’m looking for. One neuron fires in my brain and I remember that the first character is a p. So I type p and I see:

df |> p
  pop!(DataFrame)
  popat!(DataFrame,
  popfirst!(DataFrame)
  prepend!(DataFrame,
  push!(DataFrame,
  pushfirst!(DataFrame,

I think maybe for the first level we should show two levels, typeof(df) and supertype(typeof(df)), but disregard the distinction between them (unless the supertype is Any?), ordering them alphabetically as one group?

2 Likes

Okay, so since working through that example I basically did all the work needed to figure out how to do it anyway, I threw together a quick and simple script that does what I described for a simple autocomplete (you knew I would, didn’t you :triumph:)

Code here

function propose_method_autocompletions(obj, func_name_fragment::String=""; 
    type_depth = 1:typemax(Int), only_same_module = false, 
    prioritize_firstlast = false, github_inference = false, personalized_inference = false)::Vector{Method}

    @assert first(type_depth) ≥ 1 && last(type_depth) ≥ 1
    recs = Method[]
    get_type_at_depth(type, depth=1) = depth == 1 ? type : type == Any ? nothing : get_type_at_depth(supertype(type), depth-1)

    for i ∈ type_depth
        stype = get_type_at_depth(typeof(obj), i)
        isnothing(stype) && break

        meths = filter(only_same_module ? methodswith(stype, parentmodule(typeof(obj))) : methodswith(stype)) do m
            length(func_name_fragment) > length(string(m.name)) && return false
            string(m.name)[1:length(func_name_fragment)] == func_name_fragment
        end

        prioritize_firstlast || true # do cool sorting stuff, add later
        github_inference || true # do cool sorting stuff, add later
        personalized_inference || true # do cool sorting stuff, add later

        recs = [recs; meths]
    end

    recs
end

To invoke:

using DataFrames
df = DataFrame(a=1, b=2)

propose_method_autocompletions(df)

By default, it returns all available methods that can act on object df. As you start to type in a function name, the list gets narrowed down quickly:

propose_method_autocompletions(df, "p")
propose_method_autocompletions(df, "pu")
propose_method_autocompletions(df, "pushfirst!")

You can also change the search depth. By default, it searches specializations on typeof(df), as well as supertype(typeof(df)), supertype(supertype(typeof(df))), and so on. This can be changed like so:

propose_method_autocompletions(df, "p"; type_depth=1:2)

This gives methods of DataFrame as well as AbstractDataFrame, but not Any. The iterator can also be reversed 2:-1:1 to show the abstract type’s methods first.

You can also restrict the search to only methods that are defined in the same module as DataFrame:

propose_method_autocompletions(df, "p"; only_same_module = true)

And, one day, it’ll work with the arg_order_inference, github_inference and personalized_inference options too. :wink:

Someday it could be interesting to make it feed its results into a Channel, so that it can offer results with greater immediacy.

Unfortunately it doesn’t solve *my* problem, because the module didn’t export its methods so they don’t appear in methodswith :sweat_smile:

Oddly, when calling methodswith(DataFrame, DataFrames) (trying this to see if it improves speed over filtering *after* the methodswith call), it simply doesn’t return a bunch of methods of DataFrame that are indeed declared by the DataFrames module. For example, pushfirst! is missing. Strange.

3 Likes

Yes I see. It’s very funny that we’ve done the opposite things here in terms of precedence, but with the end result being almost the same from the user’s perspective :laughing:

I think /> having high precedence would be surprising to anyone who’s trying to use it for piping. For example, what does this mean?

x /> Mod1.g[1].h(y)(z)
  • Low precedence: Mod1.g[1].h(y)(x, z)
  • High precedence: Mod1.g[1].h(x,y)(z) … presumably?

Couple of thoughts here:

  • Is there a flexible lowering which might allow pipelines of /> to be optimized more completely? For example, turning x \> map(_^2) \> reduce(+) into mapreduce(_^2, +, x). In particular I’m thinking of how transducers are a lot better for compilation than iterators, as described here Comparison to iterators · Transducers.jl and that we could make use of the iterator-transducer duality if we could give the compiler the right insight. @tkf :slight_smile:
  • What about first class pipelines without data such as \> map(isodd) \> filter(_>10)? Well ok I guess these can be lowered to a lambda x -> filter(y->y>10, map(isodd, x))
2 Likes

To answer my own question, yes! The fixbutfirst/fixbutlast lowering already allows this. Adding the following definition to chain will turn a map \> reduce pipeline into a call to mapreduce:

function chain(x, f1::FixButLast{typeof(map)}, f2::FixButLast{typeof(reduce)}, fs...)
    chain(x, fixbutlast(mapreduce, f1.args..., f2.args...; f1.kwargs..., f2.kwargs...), fs...)
end

Presumably this is not the best such lowering for such transformations, as this is a fairly special case. But it does give a tantalizing taste of possibility.
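To make that fusion idea concrete, here is a self-contained toy model. FixButLast, fixbutlast, and chain below are my own stand-ins (no kwargs handling), not the actual JS implementation, but the dispatch-based rewrite is the same trick:

```julia
# Toy stand-ins so the fusion idea can run end-to-end (illustrative only)
struct FixButLast{F,A}
    f::F
    args::A
end
fixbutlast(f, args...) = FixButLast(f, args)
(p::FixButLast)(x) = p.f(p.args..., x)  # fix all arguments but the last

chain(x) = x
chain(x, f, fs...) = chain(f(x), fs...)  # default: apply steps left to right

# Fusion rule: a map step followed by a reduce step becomes one mapreduce call
function chain(x, f1::FixButLast{typeof(map)}, f2::FixButLast{typeof(reduce)}, fs...)
    chain(mapreduce(f1.args..., f2.args..., x), fs...)
end

chain([1, 2, 3], fixbutlast(map, abs2), fixbutlast(reduce, +))  # mapreduce(abs2, +, [1,2,3]) == 14
```

The specialized chain method wins dispatch over the generic one because its second and third argument types are strictly more specific.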

2 Likes

Great, thanks for the implementation!

A nice thing is that such autocomplete can be added without any changes to julia syntax: just expand it to obj |> it->func(<cursor here>, it) instead of obj |> func(<cursor here>, _). So, it can already be implemented in a package using eg ReplMaker.

I tried your propose_method_autocompletions, and some kind of sorting would definitely help.
For a simple table, tbl = [(a=1,)], the first suggestions without typing the method name are:

julia> propose_method_autocompletions(tbl)
[1] ArgumentError(msg) in Core at boot.jl:325
[2] AssertionError(msg) in Core at boot.jl:341
[3] BoundsError(a) in Core at boot.jl:277
[4] BoundsError(a, i) in Core at boot.jl:278
[5] ConcurrencyViolationError(msg) in Core at boot.jl:290
[6] DomainError(val) in Core at boot.jl:296
[7] DomainError(val, msg) in Core at boot.jl:297
[8] ErrorException(msg) in Core at boot.jl:267
[9] InexactError(f::Symbol, T, val) in Core at boot.jl:318
[10] InitError(mod::Symbol, error) in Core at boot.jl:354
[11] InitError(mod, error) in Core at boot.jl:354
[12] LineNumberNode(l::Int64, f) in Core at boot.jl:407
[13] LoadError(file::AbstractString, line::Int64, error) in Core at boot.jl:348
[14] LoadError(file, line, error) in Core at boot.jl:348
[15] MethodError(f, args) in Core at boot.jl:338
[16] MethodError(f, args, world::UInt64) in Core at boot.jl:335
[17] (::Type{T})(itr) where T<:Tuple in Base at tuple.jl:317
[18] NamedTuple(itr) in Base at namedtuple.jl:123
[19] OverflowError(msg) in Core at boot.jl:321
[20] Pair(a, b) in Core at boot.jl:825
...

None of these methods are commonly used on tables/collections.
Typing propose_method_autocompletions(tbl, "push") does find the push! function of course, but still doesn’t show the expected push!(tbl, ...) method among the first suggestions. The first is push!(s::Set, x), so it wants to put tbl as the 2nd argument.

Autocomplete based on the ideas from this topic can already be implemented in a package, and would be great if it turns out actually useful! But for now I’m skeptical, given a huge amount of methods specifying few to no argument types.

3 Likes

Oops! It appears that calling methodswith on a parameterized type doesn’t give the methods of the unparameterized type! My autocomplete was failing to find any methods of Vector because it was being parameterized with the type of its contents.

See this example.

julia> struct MyThing{T} end

julia> foo(::MyThing) = 1
foo (generic function with 1 method)

julia> methodswith(MyThing)
[1] foo(::MyThing) in Main at REPL[157]:1

julia> methodswith(MyThing{1})


julia> bar(::MyThing{1}) = 2
bar (generic function with 1 method)

julia> methodswith(MyThing)
[1] foo(::MyThing) in Main at REPL[157]:1

julia> methodswith(MyThing{1})
[1] bar(::MyThing{1}) in Main at REPL[160]:1

As a result, because an object such as [(a=1,)] is of type Vector{NamedTuple{(:a,), Tuple{Int64}}}, none of the methods of Vector appear.

I’ve rewritten it so it finds methods of the parameterized type, and appends methods of the unparameterized type (with special-case handling for vectors and matrices). I think it’ll have poor behavior in some particular cases, but for a quick hack I think it’s not too bad. Might consider doing something more sophisticated later.

Here's the code now.

function propose_method_autocompletions(obj, func_name_fragment::String=""; 
    type_depth = 1:typemax(Int), only_same_module = false, only_imported_modules = false,
    arg_order_inference = false, github_inference = false, personalized_inference = false)::Vector{Method}

    @assert first(type_depth) ≥ 1 && last(type_depth) ≥ 1
    recs = Method[]
    get_type_at_depth(type, depth=1) = depth ≤ 1 ? type : type == Any ? nothing : get_type_at_depth(supertype(type), depth-1)

    for i ∈ type_depth
        stype = get_type_at_depth(typeof(obj), i)
        isnothing(stype) && break

        meths = methodswith(stype)
        if !(stype isa UnionAll) && length(stype.parameters) > 0
            if stype <: AbstractVector # special case handling for vectors
                meths = [meths; methodswith(getfield(parentmodule(stype), nameof(stype)){T,1} where T)]
            elseif stype <: AbstractMatrix
                meths = [meths; methodswith(getfield(parentmodule(stype), nameof(stype)){T,2} where T)]
            end
            meths = [meths; methodswith(getfield(parentmodule(stype), nameof(stype)))]
        end

        meths = filter(meths) do m
            length(func_name_fragment) > length(string(m.name)) && return false
            string(m.name)[1:length(func_name_fragment)] == func_name_fragment || return false
            only_same_module && (parentmodule(typeof(obj)) == m.module || return false)
            # How to detect only modules that have been explicitly imported?
            true
        end

        arg_order_inference || true # do cool sorting stuff, add later
        github_inference || true # do cool sorting stuff, add later
        personalized_inference || true # do cool sorting stuff, add later

        recs = [recs; meths]
    end

    recs
end

Calling it on the object of [(a=1,)]:


julia> propose_method_autocompletions([(a=1,)])
[1] CapturedException(ex, bt_raw::Vector) in Base at task.jl:12
[2] \(A::LinearAlgebra.Transpose{<:Complex, <:LinearAlgebra.Hermitian{<:Complex, <:SparseArrays.AbstractSparseMatrixCSC}}, B::Vector) in SparseArrays at C:\Users\unime\AppData\Local\Programs\Julia-1.8.0\share\julia\stdlib\v1.8\SparseArrays\src\linalg.jl:872
[3] \(L::SuiteSparse.CHOLMOD.FactorComponent, b::Vector) in SuiteSparse.CHOLMOD at C:\Users\unime\AppData\Local\Programs\Julia-1.8.0\share\julia\stdlib\v1.8\SuiteSparse\src\cholmod.jl:1521
[4] append!(a::Vector, items::AbstractVector) in Base at array.jl:1105
[5] deleteat!(a::Vector, i::Integer) in Base at array.jl:1485
[6] deleteat!(a::Vector, r::AbstractUnitRange{<:Integer}) in Base at array.jl:1491
[7] deleteat!(a::Vector, inds::AbstractVector{Bool}) in Base at array.jl:1590
[8] deleteat!(a::Vector, inds::AbstractVector) in Base at array.jl:1533
[9] deleteat!(a::Vector, inds) in Base at array.jl:1532
[10] empty!(a::Vector) in Base at array.jl:1738

⋮

[3209] groupview(f, X; restype, kwargs...) in FlexiGroups at C:\Users\unime\.julia\packages\FlexiGroups\1ItB2\src\base.jl:41

Beyond that, for truly generic methods that don’t specialize on anything, whose behavior can be thought of as really generalizable across any object, they shouldn’t float to the top of such a simple autocomplete anyway; if a method is general enough not to specialize on any type, it’s probably general enough to have memorized.

It’s possible to make them float to the top anyway, using a sorting technique based on statistical inference of a model fitted to a codebase (imagine if you could train your autocomplete to your own codebase!), but I’m not going to make that at the moment.

Without such inference built in, for methods to gain visibility they need to be specialized to types such as AbstractArray or AbstractDict.

3 Likes

This would be super cool. IMHO compiler tooling should do a lot more of this kind of thing. I want some kind of data-driven approach to the JuliaSyntax.jl diagnostics system. What exactly, I’m not sure yet :slight_smile: I wish for some happy medium between traditional compilers (where all the diagnostic messages are hand-coded at huge engineering effort and are still hard to interpret), vs hugely bloated natural language models which seem hard to train and deploy, but can perform amazing feats of inferring what the user intended.

2 Likes

I think this is a relatively straightforward problem, conceptually anyway. One would wish to collect the frequencies in the codebase with which Tᵢ typed objects are fed to mⱼ methods. Then, when autocompleting methods for a Tᵢ object, simply sort the methods by order of call frequency.

I think this would probably be an optional mode: for example, one mode of autocomplete might be to sort based on type specialization; another mode might be to sort by frequency of use in your personal codebase; another mode might be to sort by frequency of use in all GitHub repos.

I think it’s probably decent to use the GitHub repo method use frequencies as a starting point, and then do some sort of weighted average with the frequencies they’re used in personal use. That way you don’t overfit to your own codebase.

It’s also possible to find the frequencies associated with certain packages being loaded (e.g., somebody who has loaded LinearAlgebra is more likely to call qr), and do a weighted average of the frequencies associated with the packages you’ve loaded in your project to get an autocomplete that’s more likely to be perfect for your specific use case. The possibilities are endless.
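A toy sketch of what frequency-based ranking and the weighted blend could look like. The counts and weight here are invented for illustration; real numbers would come from scanning codebases:

```julia
# Toy frequency-based ranking; counts are made up for illustration
corpus_freq = Dict(:push! => 120, :append! => 80, :copy => 15)
rank_by_frequency(cands) = sort(cands; by = n -> -get(corpus_freq, n, 0))

# A crude blend of global (e.g. GitHub) and personal frequencies as a weighted sum
blend(github::Dict, personal::Dict; w = 0.7) =
    Dict(k => w * get(github, k, 0) + (1 - w) * get(personal, k, 0)
         for k in union(keys(github), keys(personal)))

rank_by_frequency([:copy, :append!, :push!])  # [:push!, :append!, :copy]
```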

Indeed, although as we’ve shown here, compile times for lambdas can get prohibitive if it’s done a lot, sometimes two orders of magnitude greater compile time than the alternative. Much better to have dedicated syntax to make a partial application functor.

It’s also simply cleaner to have dedicated syntax.

I think the piping and the autocomplete are really orthogonal, though some might prefer certain syntax as a UI to autocomplete.

Imagine something like vscode if you could go to the Julia workspaces panel and just right click a variable and ask it to “explore methods for this variable” maybe it pops up a new panel with methods and you can filter them by characters you type into a search box as well as select which modules you want to search etc etc.

4 Likes

Cool idea!

I like that we’ve moved the conversation from “autocomplete is infeasible” to “this is how I would do it.” :wink:

In the same way that . dot-syntax is syntax sugar that motivates tab-complete for property names, a proper chaining syntax motivates tab-complete for method names. Hopefully this should make idiomatic Julian style easier and more accessible.