As the title says, what are your favourite ways to deal with functions/types which are central to the functionality of a package but for which the best names carry a high risk of collisions? Concrete examples in my case are functions called name, remove!, inputs and outputs.
Some options I can think of:
1. Just go ahead and export them anyway
2. Don’t export them but declare them public API in the documentation and recommend that people import them
3. Try to rename them into something less prone to collisions
I suppose 1. is less attractive if there are a lot of other exported functions as it becomes cumbersome to import everything.
I’m also a bit uncertain about the value of exporting things. In my mind it helps a little bit with discovery, but tbh I have never really made use of this myself. I know there is some advice out there to just never use using in a module and instead import exactly what is being used or just refer to things by namespace.
I think (1) is generally fine, since it makes interactive use and exploration of the package easier. If a user wants to use your package and another one with similar exported names, then they’ll have to do using Foo: bar, but that’s fine: it’s only as inconvenient as it would already have been if you hadn’t exported those names.
1 and 2 are both fine, I would think. For 3, it depends. If the functions act on a type defined in your package, export at will, because multiple dispatch will take care of it even if there are collisions. If, however, we are considering functions that act on existing types, it is reasonable to worry that there might be collisions.
Even if that were the case, though, one can just use the fully qualified function name (as in, with the module name before it) to disambiguate function calls. Such a collision will most likely be noticed by the user, and it is reasonably easy to fix for them.
In general, I guess it makes sense to settle on an exporting policy. Do you want to export all public API functions? Or, for example if you have a framework, do you only export the ‘entry points’? In my opinion it can be quite annoying and verbose for the user to have to specify the module name for everything, and it can be equally annoying to have to import functions explicitly.
But won’t they have to change using Foo into using Foo: bar, baz, barbaz, etc.? Not the worst thing that can happen if one has decent test coverage, though.
Then there is also the accidental breakage from when another package adds a conflicting name. I suppose that if one has users it is quickly discovered and fixed.
Perhaps both are non-issues if one does not use using in modules to begin with. I think I will try that approach in combination with (1) and see how it goes.
I like this option most. I don’t like when some library takes over names that I wanted to use. For example, AlgebraOfGraphics exports data, which is a name I like to use myself.
Hmm, I think that two functions need to be from the same module for this to work. If package Foo defines foo(::Bar) and package Foo2 defines foo(::Baz) they will collide.
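To make the point about dispatch across packages concrete, here is a minimal sketch using inline modules as stand-ins for packages. The module Foo3 and the type Qux are hypothetical additions for illustration; they show what a package would have to do (import the function before adding methods) for dispatch to actually merge the definitions.

```julia
module Foo
export foo, Bar
struct Bar end
foo(::Bar) = "Foo.foo on Bar"
end

module Foo2
export foo, Baz
struct Baz end
# No import of Foo.foo here, so this creates a brand-new function
# that merely happens to share the name.
foo(::Baz) = "Foo2.foo on Baz"
end

module Foo3   # hypothetical third package that cooperates with Foo
import ..Foo: foo          # extend Foo's function instead of creating a new one
export foo, Qux
struct Qux end
foo(::Qux) = "Foo.foo on Qux"
end

same_as_foo2 = Foo.foo === Foo2.foo   # distinct functions: these collide when both are exported
same_as_foo3 = Foo.foo === Foo3.foo   # one function with methods from both modules
nmethods = length(methods(Foo.foo))   # counts foo(::Bar) and foo(::Qux)
```

In other words, multiple dispatch only resolves the overlap when both packages add methods to the same function object, which requires one of them to depend on the other (or both to depend on a common package).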
Such a collision will most likely be noticed by the user, and it is reasonably easy to fix for them.
I guess it is, but it is still a bit of an annoyance when a dependency adds a conflicting name and causes breakage. An example is when a package A has using B and using C, and a minor release of C adds an export of the same name as something B already exports. I don’t think users of A can fix this (except by using dev, of course).
I suppose I will try to mitigate this for myself by trying the strategy to only use import in the module.
There is still some nagging fear that a package which exports common names becomes that annoying package which you can’t just depend on through using, but I feel like I’m a bit far into overthinking-it territory now.
This is not always the case I think. If you implement some data structure, it makes sense to define size(), copy(), push!() etc because it will be easier or more intuitive for people to use. After all, that is kind of the point of multiple dispatch.
makes sense to define size(), copy(), push!() etc because it will be easier or more intuitive for people to use.
For functions defined in Base this makes perfect sense since pretty much every package depends on Base. However, there is no function Base.data and therefore they will not be the same function.
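As a small illustration of why the Base case is unproblematic, here is a sketch of a data structure that adds methods to existing Base functions rather than defining new ones. The Stack type is a hypothetical example, not something from any package mentioned in this thread.

```julia
# Hypothetical container type, used only to illustrate extending Base.
struct Stack{T}
    items::Vector{T}
end
Stack{T}() where {T} = Stack{T}(T[])

# Qualifying with `Base.` adds methods to the existing functions,
# so there is nothing new to export and nothing to collide with.
Base.size(s::Stack) = size(s.items)
Base.length(s::Stack) = length(s.items)
Base.push!(s::Stack, x) = (push!(s.items, x); s)
Base.copy(s::Stack) = Stack(copy(s.items))

s = Stack{Int}()
push!(s, 1)
push!(s, 2)
c = copy(s)    # independent copy of the underlying vector
push!(c, 3)
```

Because every package already sees Base, all of these methods land on the same function objects; a name like data has no such shared home.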
I suppose someone could make some lightweight package called CommonNames.jl which only defines the names, but it does not seem fully practical (e.g. what goes in there? how to define the semantics? etc…).
Sure, but they’d have to write using Foo: bar, baz etc. anyway if you didn’t export, so there’s no extra work.
Also note that conflicting names will never silently break your code in Julia: if Foo and Bar both export name, then:
using Foo
using Bar
name()
will throw a specific error indicating that name is ambiguous and must be qualified with Foo. or Bar., which is exactly the right thing for a user to do.
This is different than, for example, Python where from Bar import * will silently overwrite any names that were previously imported from Foo.
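The Julia behaviour described above can be reproduced without installing anything by using inline modules as stand-ins for packages. This is a sketch under that assumption; the exact wording of the warning and error may differ between Julia versions, so the code only records that the unqualified call fails.

```julia
module Foo
export name
name() = "Foo.name"
end

module Bar
export name
name() = "Bar.name"
end

using .Foo
using .Bar

# The unqualified call is ambiguous, so Julia refuses to guess.
threw = try
    name()
    false
catch
    true
end

# Qualifying with the module name always works.
from_foo = Foo.name()
from_bar = Bar.name()
```

Neither definition silently wins, which is the property that makes exporting common names relatively safe.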
Yeah, that’s the right question to ask, and it’s why there is no CommonNames.jl. data is a nice name, but it doesn’t have enough of a common meaning to be worth sharing in this way.
If I want to use data for my own purposes, I can do import FooPackage: data as foodata. But this is annoying for me to type and confusing for readers of my code who see foodata everywhere. It would be easier for both of us if the original name were something else.
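For reference, a minimal sketch of the renaming workaround, with an inline module standing in for the hypothetical FooPackage. The `import ... as` syntax requires Julia 1.6 or later.

```julia
module FooPackage   # stand-in for a real package that exports `data`
export data
data() = "FooPackage.data"
end

# Bring the function in under a different name; `data` itself stays free.
import .FooPackage: data as foodata

data = [1, 2, 3]          # my own use of the name
result = foodata()        # the package's function, under its alias
```

This works, but every reader of the code now has to learn that foodata is really FooPackage.data.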
Sure, but they’d have to write using Foo: bar, baz etc. anyway if you didn’t export, so there’s no extra work.
Sorry for not letting this go, but I feel the dilemma is that using Foo would still be perfectly possible if only one just did not export e.g. name. E.g. typing using Foo followed by using Foo: name is less work than typing using Foo: bar, baz,....
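A sketch of what that would look like from the user's side, again with an inline module as a stand-in for the package; bar and baz are the placeholder names used earlier in the thread.

```julia
module Foo
export bar, baz          # low-collision-risk names are exported
bar() = "bar"
baz() = "baz"
name() = "Foo.name"      # public API, but deliberately not exported
end

using .Foo               # brings in bar and baz
using .Foo: name         # explicit opt-in for the collision-prone name

values = (bar(), baz(), name())
```

Users who do not need name never see it in their namespace, and those who do add one line.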
It is of course a bit of an ill-defined problem to try to classify which names have high collision risk, and to weigh the number of low-risk names against the number of high-risk ones, and I appreciate your advice to just not waste mental capacity on it.
I don’t think it is an ill-defined problem, and I don’t even think it’s that hard to judge. For example, words that are very short or very common in English/programming have higher collision risk. Of course, it could be quantified further by mining codebases.
Of course, but you also have to set a threshold, and as @rdeits said, if all your names are below this threshold then not exporting anything is not more useful than exporting everything.
That’s an anti-pattern in Julia unless the names share semantics to a large extent. I definitely don’t want
close(::GarageDoor)
close(::BankAccount)
to be in the same namespace with a common name, that’s like a rake lying in the grass, ready to hit me on the nose.
Personally I think 1 and 2 are fine options, depending on the API and whether it is intended for interactive use or package code. For the latter, using everything explicitly is fine and is a good habit.
I also prefer 1 or 2. I find good names very important for a package’s usability, so better use the “best” name you can find for functions. Also, if you export a name for something the user doesn’t use, they will not notice:
using DataFrames
# This causes no error although `join` is exported by DataFrames.jl
join = 23
Here join can be defined by the user because they haven’t previously used the definition from DataFrames.jl. (But that doesn’t help in the case of two packages exporting the same name, and you want to use one of them.)