LightGraphs users (and other interested parties): please opine below

Not quite sure how to narrow down the audience to interested parties, so I’m placing this here.

LightGraphs 2.0 is in active development. One of the big changes is the movement of functions into separate submodules. (In combination with our expanded use of multiple dispatch, this makes for a pretty elegant approach.) However, we have some new issues relating to exports as a result, and I’d like the group’s opinion on the best way to proceed.

In short, we now have common functions that dispatch based on structures: shortest_paths(g, 1, BellmanFord()) vs shortest_paths(g, 1, Dijkstra()), for example. These methods and structs live in the LightGraphs.ShortestPaths submodule.

The problem is what to export from these submodules. My initial feeling is the answer should be “as little as possible, if anything”. The reason this is an issue is name conflicts on some of the structures that describe algorithms. Dijkstra and Tarjan, for example, were prolific graph researchers and contributed to the field in multiple ways, and are therefore defined in multiple submodules. This will cause a conflict if these structs are exported from each module, and the modules are used concurrently (either directly or transitively).

The impact of the decision is one of user-friendliness. That is, in a “export nothing” scenario, users would be moving from

using LightGraphs
...
p = dijkstra_shortest_paths(g, 1)

to

using LightGraphs
using LightGraphs.ShortestPaths: shortest_paths, Dijkstra
...
p = shortest_paths(g, 1, Dijkstra())

or even

using LightGraphs, LightGraphs.ShortestPaths
...
p = ShortestPaths.shortest_paths(g, 1, ShortestPaths.Dijkstra())

So
option 1) export nothing

The second option is to keep the structures unexported, but export the functions/methods. This results in the ability to run the functions with the default algorithms (where they exist), or forcing the user to explicitly import an alternate algorithm.

option 2) export functions/methods only

The final options I can think of are to
option 3) export everything, and let the users deal with any conflicts, or
option 4) export everything and rename everything so that there are no conflicts

Neither of these last 2 options seem to be a good idea. The former may produce conflicts in transitive dependencies that the user is unaware of (LightGraphs.Connectivity depends on LightGraphs.ShortestPaths which depends on LightGraphs.Traversals); the latter will eventually be unsustainable as new algorithms get included in the package.

Opinions please.

  • Option 1: export nothing
  • Option 2: export functions/methods only
  • Option 3: export everything and let users deal with name conflicts
  • Option 4: export everything and rename conflicting structs
  • Option 5: something else (see below)

0 voters

Edited to add: an additional advantage of option 2 is that we have an explicit way to publish the API: that is, exported functions are part of the API; unexported functions aren’t and can change without causing a breakage. Option 1 doesn’t have this ability and we’d have to figure out another way to mark functions as part of the official API.

Feel free to add comments / ask questions below. Thanks for the help.

2 Likes

Export nothing is not a good option because users can already do that: import LightGraphs. So you might as well export your public API both as a way to document it and for helping people do REPL work, and if you really want to say that good habit is to namespace, then just use import in your docs.

8 Likes

I am a fan of not exporting anything, but can’t argue with Chris here. Just be selfish. Export what you want to export for your own workflow and if I don’t like it, I’ll just import (which is what I usually do anyway) :slight_smile:

2 Likes

One advantage of Julia is that the danger of overwriting functions is much less in Julia (because of multiple dispatch) than in (e.g.) Python: if, for some reason, a function with the same name already exists, that is not a big problem unless the existing function takes the same argument type(s). Because of this, I kind of like to not have to write a lengthy package name in the function call.

This is true for functions, but not for structs. They must be uniquely named throughout a given import space.

1 Like

This is probably a dumb question, but why can’t you make a traits module with things like Dijkstra and then use the same type defined in the initial traits module.

I’m not sure I follow. Right now, in LightGraphs.ShortestPaths, we have a type hierarchy for Dijkstra as follows:

Dijkstra <: SSSPAlgorithm <: ShortestPathsAlgorithm <: AbstractGraphAlgorithm

In some other module, we might have

Dijkstra <: SomeOtherAbstractAlgorithm

This allows us to do some fancy dispatch to eke out the best “default out-of-the-box” performance possible.

Edited to add: the two Dijkstra structs may have completely different fields.

Where would traits fit in?

1 Like

I didn’t realize that you were using Dijkstra to hold the parameters. That definitely complicates things.

It always stinks to have name conflicts and stuff, but my opinion is usually “export everything you can without conflicts”. It makes peoples lives easier usually, but of course at some cost. So I am a fan of option 2.

Most end users are casual people, they don’t know your API, the first 5-10 uses of it they are copy pasting from docs and then modifying things to suit their needs to figure things out. So having things accessible is nice for new peoples.

As cool as it may seem to expect users to really memorize our code bases and fully understand the docs before use - it just doesn’t happen in the majority of cases. I think some developers forget what its like to be on the outside of their own API’s. Really glad you made this open to community vote Seth, thats a gutsy way to see a lot of opinions, but shows your intentions are in the right place! Definitely looking forward to v 2.0!

4 Likes

You could make a LightGraphs.Dijkstra with all the fields from the sub-modules structs of the same names, presuming all those fields have distinct names. Then just pull out the fields that are actually needed for the function needed… eg

shortest_paths(x,y,z,algo::LightGraphs.Dijkstra) = shortest_paths(x,y,z,ShortestPaths.Dijkstra(algo.a, algo.b))
community(x,y,z, algo::LightGraphs.Dijkstra) = community(x, y, z, Community.Dijkstra(algo.c, algo.d))

I’m not sure if it’s worth bending over backwards in this way, but it should be possible.

If methods that differ only by the name of keyword arguments were distinct, you could just have a bunch of different methods of Dijkstra with different keyword arguments…

just my 2cents: I try either

  • export names but keep them informative shortest_path(g,1, Djikstra()) is perfectly fine when exported.
  • don’t export but make module name a “part” of the function name, e.g. ShortestPaths.djikstra(g, 1; params...)

I admit that this example doesn’t illustrate its usefulness that well, but my point is that when looking into module ShortestPaths you already expressed that You’re interested in well… finding shortest paths :wink: so no need to repeat that in shortest_path (function name)

1 Like

I looked at the code a bit and here is one more idea.

struct PathParams{A,names,T}
    algo::A
    params::NamedTuple{names,T}
end

function Base.getproperty(pp::PathParams, s::Symbol)
    return getproperty(getfield(pp, :params), s)
end

struct Dijkstra end

function PathParams(algo::Dijkstra, all_paths=false, track_vertices=false, maxsdist=typemax(Float64))
    PathParams( algo, (all_paths=all_paths, track_vertices=track_vertices, maxsdist=maxsdist))
end

function shortest_paths(g::AbstractGraph, srcs::Vector{U}, distmx::AbstractMatrix{T}, alg::PathParams{Dijkstra}) where {T, U<:Integer}
...
end

This gives you one Dijkstra type and shouldn’t require changing the shortest_paths code past the the methods head because the same exact fields are available using getproperty

4 Likes

I actually like this approach quite a bit. I voted for option #2 above, but if only functions/methods are exported, then we would have this:

using LightGraphs

p = shortest_paths(g, 1, ShortestPaths.Dijkstra())

which has one repeat of the “shortest paths” information. Of course if ShortestPaths.Dijkstra is the only Dijkstra you’ll be using, then you can do using LightGraphs.ShortestPaths: Dijkstra to get rid of the redundancy.

This is similar to what languages that lack generic functions do with modules. In Elm, they have separate List and String modules which contain List.map and List.filter, and String.map and String.filter, respectively. That’s not quite as nice as having generic map and filter, but it’s still pretty readable.

Of course the strict analogy in this case would be to have Dijkstra.shortest_paths(), but that’s probably the wrong way to modularize LightGraphs.

OK – does this mean that the constructor method in a struct would have problems if there are two structs with the same name (irrespective of whether the structs are different)?

I don’t think you can have two structs with the same name inside the same namespace:

julia> struct Foo
       x::Int
       end
julia> struct Foo
       y::String
       end
ERROR: invalid redefinition of constant Foo
1 Like

An idea which I don’t know if it’s sensible for LightGraphs: have an exported Dijkstra() function (or type) which, depending on the arguments, creates a ShortestPaths.Dijkstra object or an OtherModule.Dijkstra object.

This means that the arguments for the different Mod.Dijkstra types must be distinct, in particular the no-argument version is available for only one of the types. Or the no-argument Dijkstra() could create an object of a third type which is accepted by both shortest_paths and other_module_algo as a tag for what should be the default parameters for the corresponding Dijkstra algorithm.

My objection to this would be that a user may expect that a capitalized function is a constructor X that creates an object of type X.

Here it doesn’t, X(...) creates an A.X or a B.X, which are unrelated except for the name.

3 Likes

Yes I agree, with this idea I initially thought this could be respected, but the A.X or B.X already have to inherit from something else than A. Maybe non-capital x would do, e.g. dijkstra() etc. ?

1 Like

That’s too close to the original, and unfortunately misses the goal of having a struct used as a dispatch mechanism for a generic shortest_paths() function.

Ok, but then what is the point of ShortestPaths.Dijkstra struct, when it will be used only in ShortestPaths.shortest_path function? Note that by introducing a submodule you express the same intent as writing shortest_path(graph, A::ShortestPathAlgorithm). Will you use the fact that Dijkstra <: ShortestPathAlgorithm elsewhere?

arguably ShortestPaths.dijkstra(graph, params...) sounds/writes much better/looks much clearer than
ShortestPaths.shortest_path(graph, ShortestPaths.Dijkstra(params...)), that is unless everything is exported.


on the other hand you can have both defining

function ShortestPaths.dijkstra(graph, params...)
    return ShortestPaths.shortest_path(graph, ShortestPaths.Dijkstra(params...))
end

so maybe it doesn’t really matter that much :wink: