How to discover functions which apply to a given object?

question
#1

This is, at its heart, a question related to user experience, particularly with an eye toward rapid prototyping.

For me, one of the things which makes Python so great for rapid prototyping is the ability to easily discover functionality without digging through the source code or the documentation. I can create an object A, hit A.<tab>, and I will be presented with a list of all the sorts of things I might want to do with A. The tab-complete and shift-tab lookup in Jupyter makes it possible to get exactly the part of the documentation I need quickly, and tab-complete with decent naming conventions often removes the need for documentation at all.

With Julia, on the other hand, I often feel lost. If I create some object A, there is no way for me to know which functions might apply to A. I can’t quickly discover what some package writer might have had in mind for me to do with A; I basically have to search through the documentation, if present, or (more often) the source code. This is a side effect of objects not having methods, and one I wasn’t anticipating.

My question: am I missing some functionality within the Jupyter interface which facilitates a similar level of discoverability?

To keep this short, I will go into more detail in a follow up comment.

#2

You can try the methodswith function.

#3

For example, let’s say I create a set:

A = {1,2,3,4,5}

Now suppose I want to take one element off. Even without consulting the documentation, even if I’ve never used a set in Python before, I know that I can probably find what I need by just hitting A.<tab> and Jupyter will suggest all the useful things I might want to do with a set:

image
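For anyone who can’t see the screenshot, roughly the same list can be reproduced in plain code. This is only a minimal sketch: it assumes the drop-down shows about what dir reports once underscore-prefixed names are filtered out, and public_names is a hypothetical helper of mine, not part of any library.

```python
def public_names(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


A = {1, 2, 3, 4, 5}
names = public_names(A)

# The list is short enough to scan by eye, and "pop" is in it.
print(len(names), names)
```

The exact contents may differ slightly between Python versions, which is why no expected output is shown.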

There are a very reasonable number of items to search through, and from this alone it is clear that what I want is A.pop(). Of course, features which are used often from the standard library become familiar with time, but for third-party packages this discoverability is paramount. For example, using the graph package networkx, I barely had to consult the documentation to figure out how to use it. Here’s a version of how my thought process went:

“Great, looks like networkx can do graphs! Let’s import it.”
image
“Hmm, how to create a graph? Maybe there is a Graph constructor…”
image
“Perfect! Now, let’s see, do I need to pass it anything?”


“Looks like it’s optional. Quick peek at the examples…”

“OK, so I can create an empty graph and add nodes. Let’s start there…”
image
“OK, now how did I add a node? Was it add_node or addnode or addNode? Let’s check…”
image
“Oh sweet, look at all this stuff I can do! I was going to add nodes one at a time but it looks like I can pass a whole list…”

“Yup, nice! Now how would I go about converting this to a directed graph?”
image
(and so on)

The point is that it was very easy and quick to discover and explore the functionality without consulting the documentation.

With Julia, a side effect of multiple dispatch is that I never know what functions are intended to apply to a given object. If I create a set and I don’t remember how to grab an element from it, I have to search through the documentation. Even then, there isn’t a section which seems to list “common things you might want to do with sets”, since many of the things you do with sets (count their elements, retrieve a single element) are also done by anything which is a collection, so discoverability has been a challenge.

With third-party packages, this is even worse! With enough use, elements of the standard library will become familiar, however what happens when I try to use the Julia graph package without consulting the documentation? If I want to, I can import the package and look at everything it provides,

image

but this list is huge and I am not going to learn much by searching through it. Applying my previous workflow, I might try this:

image

“OK, so do I need to pass it anything?”
image
“Hmm… OK, not sure. That may be a documentation issue, although simply having some constructor heading would be helpful… Let’s try it anyway.”
image
“Hmm… OK, that’s not helpful. How can I add nodes to this? Maybe there is an add_nodes function like in python?”
image
(hitting <tab>, nothing happens)
“Hmm… So I guess I can either look through all the functions available from LightGraphs or I have to just guess at the name. Maybe it starts with add…”
image
“OK, add_vertices! looks promising! Now how would I go about converting this to a directed graph… Hmmm… Guess I should search the docs or look at the source code…”

#4

Thanks, that is super helpful!

image

However, it doesn’t seem to quite do what I’m looking for, depending (I think) on how abstract the type is. For example, I had created a LightGraphs.Graph and then added a node to it, and I mentioned that it was hard to find the add_vertices! function. Trying to find it with methodswith doesn’t seem to work:

It’s not entirely clear to me why.

In addition, while it is useful, having to type out and run a separate command is not as convenient as simple tab-completion. Is there a package or plans to add this sort of “methodswith” functionality in some interactive way?

#5

See the documentation of methodswith, in particular the supertypes keyword.

Also, please don’t post screenshots, quote code instead.

#6

Thanks.

(I was posting screenshots specifically because I wanted to show the tab-completion drop-downs.)

#7

Have a look at

#8

At some point I did a prototype of “tuple completion”: basically, you would write the arguments of a method in parentheses (e.g. (x, 1) where x is a Set) and on tab it would show you all the methods with that type signature, with (x,...) being equivalent to methodswith. Completion would then add the method name in front of the tuple, making it a valid function call.

I didn’t find it that useful though (it was a bit buggy admittedly). But since InteractiveCodeSearch seems to provide everything you would need to do that kind of search it wouldn’t be very hard to implement it again in an editor.

You could have a lighter syntax, e.g. x) alone would trigger methodswith, like f( triggers methods.

#9

Probably, in time. But keep in mind that this is a tooling issue, and nothing really to do with python or Julia.

#10

True, some of this can be fixed with tooling.

To some extent however it’s a result of the objects-don’t-have-methods approach that Julia takes. In Python, the tab-complete functionality provides a list of the methods attached to a given object, which can be seen as a curated subset of all possible functions which could accept that object as input. The object oriented focus of Python encourages this approach, and the dot to initiate method calls makes the tab-complete quite natural. On the other hand, the output from methodswith seems like, depending on the type structure, it is often either too little information or way too much. For example, with a graph A, I get 0 methods if I don’t use supertypes, and 242 methods if I do. In addition, the nature of the Julia syntax makes a similar tab-complete approach somewhat unnatural, as you wouldn’t really be writing the object first anyway.

I personally find the object-method approach a more natural way of thinking. Perhaps it is just because that is what I’m used to, but when I am working on a problem I typically know which object I need to work with, and by simply typing it and hitting <tab> I quickly narrow my scope to the operations that are possible. With Julia, however, as a consequence of syntax, I end up sitting there with a blank cell and an infinite world of possibilities. Another example thought process, recreated after the fact:

I have some graph object A.
“How many nodes are in this graph? Oh, right - multi-dispatch - maybe they overrode the length function…”
image
“Nice! Wait… is that the number of nodes or number of edges? Let me check the documentation…”
image
“Ugh, no that’s just the generic documentation, no help. OK, let’s try something different - how about we get all the nodes first… I bet that’s just a property of the Graph object:”
image
“Hmm, the nodes aren’t here… they must be returned from some function. Maybe nodes?”
image
“Nope. Maybe if I check just the things exported from LightGraphs…”
image
“Ugh, let me go consult the documentation…”
image
“OK, OK, so they call them ‘vertices’, not ‘nodes’…”

I guess the point here is that this is part of the reason Python is so great for rapid prototyping: I could have discovered that immediately just by hitting A.<tab>, without having to context-switch between code and documentation all the time. I would have looked for nodes, it wouldn’t have been there, but I would only have had to quickly scan a short list of items to notice vertices was there, and I would have moved on.

I don’t mean to complain, I’m just offering this up as an example and wondering what process more experienced Julia users would do to discover functionality like this.

#11

A further thought: It is interesting that, in Python, tab-complete, method chaining, and (to some extent) conceptual order have played a role in guiding which functions people create as methods and which are created as standalone functions, with the result that typically these lists are short, easy-to-parse, and have exactly the sort of functionality that you’d expect to find.

Grammar is the most interesting: for the sort of things where you think of the action first and then the object, these are represented as standalone functions. These generally apply to actions which can be applied to many different objects (from the standard lib, things like len, zip, from an example third-party module np.save). These are all instances where I think of the action first, and are somewhat canonically named, so they are easy to find.

On the other extreme, object methods are predominantly things where I think of the object first and then the action. The A.nodes() scenario, where A is a graph, is a good example of this. In this instance, I naturally think of the graph object first, and wanting to get the nodes out of it.

Of course nothing prohibits both, as in numpy, where most methods have identical standalone functions (e.g. a.min() or np.min(a)). However, I would argue numpy is potentially an example where making some choices about which functions are better as standalone and which are better as methods would be useful.


The two approaches I think which would mitigate some of these issues are:

  1. Uniform Function Call Syntax
  2. Methods on Objects

The first is just syntactic sugar so that A.funcname() parses to funcname(A). The second could honestly just be a list of methods for the purpose of populating a tab-complete list. Then, if A has a method list, tab-complete could pull just from that list, while shift-tab-complete could return a more complete list à la methodswith.
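To make the second idea concrete, Python already has a hook of roughly this shape: a class can define __dir__ to decide what dir reports for its instances. The sketch below is only an illustration of a curated list next to a full one. The Graph class, its methods, and the _curated attribute are all hypothetical, and I’m assuming the completer builds its list from dir, which may not hold for every front end.

```python
class Graph:
    """Hypothetical toy graph, used only to illustrate a curated completion list."""

    # Names the author wants surfaced first (an assumption of this sketch).
    _curated = ["add_vertex", "vertices", "to_directed"]

    def __init__(self):
        self._adjacency = {}

    def add_vertex(self, v):
        # Register v with an empty neighbor set; return self to allow chaining.
        self._adjacency.setdefault(v, set())
        return self

    def vertices(self):
        return sorted(self._adjacency)

    def to_directed(self):
        # Placeholder: a real implementation would build a directed copy.
        return self

    def debug_dump(self):
        # Deliberately left out of the curated list.
        return dict(self._adjacency)

    def __dir__(self):
        # dir() calls this hook and sorts whatever it returns.
        return list(self._curated)


A = Graph().add_vertex(2).add_vertex(1)
short_list = dir(A)             # curated view, the kind of thing <tab> could show
full_list = object.__dir__(A)   # everything, the kind of thing shift-<tab> could show
print(short_list)
print(len(full_list))
```

The short list holds only the three curated names, while the full list still contains debug_dump and all the inherited dunder methods, which is the gradation I’m after.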

#12

It looks like Uniform Function Call Syntax was discussed a long time ago, and for some reason people dislike it, however the only reason I see is something about the dot operator being too useful to be simply a synonym for function calls. Does it actually conflict with any other syntax? I don’t think it would conflict with broadcasting, for instance, so if it has not been used for anything else in the last 5 years maybe it is time to consider it?

#13

That’s an understandable opinion, but it is worth considering how much your personal level of experience with Python vs Julia has contributed to what you consider “natural”. In general, we as humans have a tendency to conflate things we personally understand well with things that are “natural” or “intuitive”. I certainly have that tendency.

And, in addition, it’s also worth noting that even in Python, if you’re using tab completion to look up methods, then you are missing a huge category of possible operations: namely anything which you might describe as a functional programming style. For example, from tab completion on a collection, you would never discover that you can map, reduce, filter, or enumerate that collection, collect it as a list, or pass it to any other generic function. Nor would you discover which operators are applicable (for example, Python now has a matrix multiplication operator @, but there is no way to discover this through tab completion except by looking for __matmul__ methods, which isn’t exactly natural). That’s not really Python’s fault, but it shows how the reliance on tab completion for method exploration isn’t even a particularly complete solution in a somewhat object-oriented language like Python.
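Here is a small sketch of what I mean, using only the standard library. It assumes tab completion is built from dir, and the Scale class is a hypothetical toy type I made up for the operator part; the hook Python uses for @ is __matmul__.

```python
data = [3, 1, 2]

# Generic builtins that all accept a list...
generic = ["map", "filter", "enumerate", "sorted", "len"]

# ...yet none of them appear among the list's attributes.
missing = [name for name in generic if name not in dir(data)]

doubled = list(map(lambda v: 2 * v, data))
odd = list(filter(lambda v: v % 2 == 1, data))
indexed = list(enumerate(data))


class Scale:
    """Hypothetical toy type whose @ operator multiplies two scale factors."""

    def __init__(self, factor):
        self.factor = factor

    def __matmul__(self, other):
        return Scale(self.factor * other.factor)


combined = Scale(2) @ Scale(3)

# The only public name on a Scale is its field; the operator hides behind a dunder.
public = [name for name in dir(combined) if not name.startswith("_")]
print(missing, public, combined.factor)
```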

And, for what it’s worth, at this point I’ve been using Julia long enough that the entire idea of object-oriented anything feels pretty unnatural :wink:

#14

I agree, I may just be used to it.

Still, I don’t see how UFCS and an (optional) method list would hurt things. I mean, which is more readable:

A = Graph()
eigenvalues(laplacian(transitive_reduction(fill_from(A, edge_list))))

or

A = Graph()
A.fill_from(edge_list).transitive_reduction().laplacian().eigenvalues()
#15

On second thought, that’s only partly true. The function chaining above is very much a functional programming style. The particular location of the functions, whether they come before or after their first argument, is really just a syntax choice. JavaScript puts the map, filter, and reduce commands as methods on the collection object.

UFCS offers the flexibility of both approaches - you can write it in whichever order makes the most sense. This not only helps with readability but also facilitates conceptual flow. Honestly, even if you claimed

eigenvalues(laplacian(transitive_reduction(fill_from(A, edge_list))))

was easier to read, wouldn’t you end up having to write it “inside-out” anyway?

The only downside in my mind to UFCS is that you still don’t get the benefit of having the curated list of methods available. The fact that any function f(x) can also be called as x.f() means that x.<tab> would suggest something along the line of what methodswith returns. As mentioned above, this can jump from 0 returned methods to 242, depending on if you consider supertypes or not, with no gradation or curation.

That’s where the optional list of methods associated with a type would come in. My proposed tab + shift-tab complete interactivity that I described above would then allow you to have the best of both worlds.

True, you wouldn’t discover infix operators this way unless they were also defined as prefix operators… but in Julia that is the case, so we would discover them! The existence of infix operators itself stands as a testament to the importance of allowing flexible ordering; otherwise we would all be writing things like (+ 3 5) as in Lisp.

Of course there are limitations. An Int64, for example, will return a huge list of methods that you wouldn’t want to parse through, but I don’t object to putting some time into learning the standard library. Just because something does not completely replace documentation doesn’t mean it’s useless, however, and even in these situations additional tooling like fuzzy string-matching and extending the search to docstrings can be surprisingly helpful. Bringing the docs closer to the interface and allowing you to zip around them interactively in a way closely related to the code at hand is essential to make rapid prototyping easier.

The changes I’m suggesting are purely syntactic sugar, they don’t change anything fundamental or semantic about the language but I believe they would improve expressivity, usability, and discoverability massively by allowing a flexible natural ordering and empowering standard IDE interactivity like tab-complete in a way which is currently hindered by the rigid choice of syntax.

#16

It has been used for overloading getproperty instead. It is possible to emulate Uniform Function Call Syntax using getproperty.

struct SomeType
    f::Int
end

Base.getproperty(x::SomeType,y::Symbol) = getproperty(x,Val(y))
Base.getproperty(x::SomeType,::Val{:f}) = getfield(x,:f)
function Base.getproperty(x::SomeType,::Val{T}) where T 
    g(z...) = (getfield(@__MODULE__,T)(x,z...))
    return g
end

SomeType(4).print()
SomeType(2).print(4)
SomeType(1).print(4,5)

#printed output

SomeType(4)
SomeType(2)4
SomeType(1)45

You can also write it like this in julia:

A = Graph()
fill_from(A,edge_list) |> transitive_reduction |> laplacian |> eigenvalues
#17

Nice! When I first read your response I thought it would be restricted to UFCS on your specific type only, but Julia allows overwriting of functions so the following sets up UFCS for everything:

import Base.getproperty

function getproperty(Core.@nospecialize(x), f::Symbol)
    try
        getfield(x, f)
    catch e
        if isa(e, ErrorException)
            try
                getfield(@__MODULE__,f) # just to throw an exception if function doesn't exist
            catch
                throw(e)
            end
            (z...) -> getfield(@__MODULE__,f)(x,z...)
        else
            throw(e) # rethrow anything that is not a missing-field error
        end
    end
end

A = [3,2,1]
A.sort().print()

# Prints out:
[1, 2, 3]

Unfortunately this doesn’t work with tab completions, but that actually looks like an easy extension!

Nice, although what happens with functions that take multiple arguments? The nice thing about UFCS is that the output from the previous call is often the main object of consideration, and functions are usually written so that it works nicely to pass that output in as the first argument. For example, if there were a function connected_components(G,k) which returned the set of k-connected components, UFCS would let me do

A.fill_from(edge_list).connected_components(3)

I feel like, if anything, this piping style

fill_from(A, edge_list) |> connected_components(3)

suggests that the function gets curried, so instead I’d probably have to do something like this:

fill_from(A, edge_list) |> x -> connected_components(x, 3)

which gets unruly somewhat quickly.

#18

Modifying tab completion is pretty easy:

import Base.propertynames
function propertynames(x)
    fns = fieldnames(typeof(x))
    mns = unique([s.name for s in methodswith(typeof(x), supertypes=true)])
    (fns...,mns...,)
end

Obviously some (significant) pruning is needed, but the general idea is there. It’s not as clear to me if getting shift+tab to display a list (as with tab completion) is possible without hacking at Jupyter itself, does anyone know about this?

#19

One side effect of redefining Base.getproperty globally like that is that it makes things less inferable.

import Base.getproperty

function getproperty(Core.@nospecialize(x), f::Symbol)
    try
        getfield(x, f)
    catch e
        if isa(e, ErrorException)
            try
                getfield(@__MODULE__,f) # just to throw an exception if function doesn't exist
            catch
                throw(e)
            end
            (z...) -> getfield(@__MODULE__,f)(x,z...)
        else
            throw(e) # rethrow anything that is not a missing-field error
        end
    end
end

A = [3,2,1]
k(x) = x.sort()
@code_warntype k(A)

#outputs

@code_warntype k(A)
Body::Any
1 ─ %1 = invoke Base.getproperty(_2::Any, :sort::Symbol)::getfield(Main, Symbol("##3#4")){Array{Int64,1},Symbol}
│   %2 = invoke %1()::Any
└──      return %2

I have tried to come up with a more type-stable way of doing this, but doing so globally gives me either segfaults or stack overflows, which is not totally unexpected.

The following function overloads getproperty in a more type-stable way, but only for a given type:

function UFCS(type::Type,mod = @__MODULE__)
    @eval Base.getproperty(x::$type,y::Symbol) = getproperty(x,Val(y))
    for field in fieldnames(Base.unwrap_unionall(type))
         @eval Base.getproperty(x::$type,::Val{$(Meta.quot(field))}) = getfield(x,$(Meta.quot(field)))
    end
    @eval Base.getproperty(x::$type,::Val{T}) where T = (z...) ->  (getfield($mod,T)(x,z...))
end


UFCS(Array)

A = [3,2,1]
k(x) = x.sort()
@code_warntype k(A)

#outputs

Body::Array{Int64,1}
1 ─ %1 = (Base.arraysize)(x, 1)::Int64
│   %2 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Int64,1}, svec(Any, Int64), :(:ccall), 2, Array{Int64,1}, :(%1), :(%1)))::Array{Int64,1}
│   %3 = (Base.arraylen)(x)::Int64
│   %4 = invoke Base.copyto!(%2::Array{Int64,1}, 1::Int64, _2::Array{Int64,1}, 1::Int64, %3::Int64)::Array{Int64,1}
│   %5 = Base.Sort.sort!::typeof(sort!)
│   %6 = invoke Base.Sort.:(#sort!#7)($(QuoteNode(Base.Sort.QuickSortAlg()))::Base.Sort.QuickSortAlg, Base.Sort.isless::Function, Base.Sort.identity::Function, Base.Sort.nothing::Nothing, Base.Sort.Forward::Base.Order.ForwardOrdering, %5::Function, %4::Array{Int64,1})::Array{Int64,1}
└──      return %6

It is best to avoid using this function on any type that does its own getproperty overloading.

#20

I don’t see why method discovery has to be tied to object.method syntax. Just define some key combination (for example ctrl-.) invoked on an object to return a version of the output of methodswith in a dropdown list. This is a question of tooling.

Concerning which methods should be returned, that would be a question of configuring the scope and sorting that methodswith uses. It would also be nice to keep methods apart from fields and properties, which you don’t get with Python’s .-tab completion.
