How about, instead of distinguishing projects/packages; top-level/reusable; whatever, each repository can just have certain characteristics or tags such as runnable, includable (bad terms, I am just making these up).
But if it depends on a bunch of global configurations which you may or may not have, is it really "runnable"? The only thing I'd clearly classify as "runnable" is a package which has tests on Travis/AppVeyor (since I use both Linux and Windows) and good test coverage; anything else is suspect.
I think that is not the point. I am talking about labels, while you mean testing.
It should be. Otherwise what is the proposal for things to share? Untested code which ran on one computer at some point in time? That's a notebook file in a repo, not something that should be formalized.
Julia has the tools to make everything of use into a package, so a script like that could be small and all of the meaningful parts could be tested. That should be used.
That is what the Travis test badges are for.
Yes, and everything else is "random script in a GitHub repo, use at your own risk". I don't get why we need a name or formalisms for that.
When trying to communicate intentions, there are two extreme options:
- Write a README file where the intention is clearly stated in detail.
- Name the product itself after the intention.
The README approach does work, but it is the most free-form and least standardized option, not to mention that some people don't bother writing one, hoping that their code is self-explanatory.
Names are expected to answer the question of what something is. Of course that has multiple answers. Something may be Julia code, so we may call it a script. It may be reusable, coherent code (ideally, all code is reusable), so we may call it a program. It may consist of separate functionalities packed into a single library, so we may call it a package. And it may be code not intended for reuse, so we may call it what? A project? It doesn't follow. It is the product of a project, not the project itself. An application? Maybe, although that word is used for compiled executables. Still, what stops me from reusing that product? Its present configuration? How hard is it to change it? We want it as easy as it gets. In any case, intentions reside in the creator's mind, not in the package itself. To the eyes of a programmer who intends to reuse it, it remains a reusable package.
Then come the non-extreme options. An option closer to the name is the suggested tags/labels/badges. An option closer to the README is a file with a special format. I believe that both options are better than either of the two extremes. Maybe we can come up with an option better than all of them, but it will be somewhere in the middle. The extreme name option is neither the only alternative to a README nor the best one.
@ChrisRackauckas I was in fact trying to suggest something to make the distinction more light-weight since I actually dislike a (to me arbitrary) division into packages and projects.
But I would very much like a much, much smaller `METADATA_CORE.jl` of key packages that are carefully reviewed, where naming conventions are actually enforced, and so forth, rather than the current free-for-all. There your suggestion makes sense.
I'm afraid I have no idea what you're talking about at this point.
Having runnable vs. reusable as independent properties is certainly possible and an interesting idea to consider; that's what I was alluding to in the last paragraph of this post. However, I'm not convinced that the combinations other than the ones I listed are sensible or practical. For example, I'm quite convinced that providing global runtime configuration makes sense if and only if it makes sense to "run it". Does it make sense for a package (i.e. reusable code) to also be runnable? It might in the sense of having usage examples.
No worries, I appreciate your effort either way. I wish I could express my point in Julia code, but the language hasn't reached that level yet.
The bottom line is that I'm against the suggested distinction, and especially against calling some packages projects. The suggestion of independent properties makes more sense, whatever form they may take.
FWIW for my purposes local packages do the trick perfectly. Even without some main function it is easy enough to structure that package in a way that the targeted user just has to call some function `run_experiment(use_3d_pi_charts=false)` to reproduce the results (or achieve whatever the point of the experiment is).
That said, I am quite confident it would be a nice feature for scientists to offer an easy way to freeze some working code/package in time, if that sentence makes sense. As important as reproducible research is, it is quite a hard sell if one has to maintain the code after the paper using that code has been published. People want to move on to new things.
I agree that there should be an easy, foolproof way to say "this works as it is right now", and to answer others' requests for the work product with the most recent "this works as it was right then" snapshot, unless the most recent work product, working or not, has been explicitly requested (like `Pkg.checkout` vs `Pkg.add`). I don't think the GitHub version...patch tagging facility is user/scientist friendly in that way. Although we could autogenerate the next tag in sequence and have META remember which tags mark a "works as it was right then" work product, my experience is that it is too easy to break tag assumptions.
Concerning Projects vs Packages: why not just standardize where to put runnable scripts in packages as we know them now? Say a folder `run` or `scripts`, and the main program would be `run/main.jl`. Pure "Projects" would have an empty `src/` folder and a full `run/` folder, and vice versa (most would have a bit of both). Similar to `Pkg.test("SomePkg")`, we could have a `Pkg.run("SomePkg")` to run `run/main.jl`.
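To make the proposed convention concrete, here is a minimal sketch of the lookup that something like `Pkg.run` would need. It is written in Python purely for illustration; `Pkg.run` is only a proposal in this thread, and the helper names `find_main_script` and `run_command` are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def find_main_script(package_dir: str) -> Optional[Path]:
    """Return the path to run/main.jl inside package_dir, or None if it is absent."""
    candidate = Path(package_dir) / "run" / "main.jl"
    return candidate if candidate.is_file() else None


def run_command(package_dir: str) -> List[str]:
    """Build the command line that would execute the package's main script."""
    script = find_main_script(package_dir)
    if script is None:
        # A pure library (only src/) has nothing to run under this convention.
        raise FileNotFoundError(f"no run/main.jl found in {package_dir}")
    return ["julia", str(script)]
```

Under this sketch a package with only `src/` is simply not runnable, while one with `run/main.jl` yields a command that can be handed to a process launcher.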
I'm in favour of this idea in any event. Sometimes a package is written to provide some functionality that can then be used in other packages and projects, but for convenience you might want some very simple programs that come with the package, wrapping a few of its functions so they can be called from the terminal. I recently made a package with code and functions for DNA sequence dating which is useful to load in other projects, but it is also useful to distribute a little program as you describe, one that can be invoked from the terminal, accepts a few inputs and spits out an output.
I'm in favor of a `run/` folder.
I think `scripts/` is more ambiguous.
I posted my suggestion as an issue over in the Pkg3 Julep:
https://github.com/JuliaLang/Juleps/issues/20
Similar to mauro, I also have tons of command line utilities in my Python packages. The users use them as libraries but have some terminal commands for common operations (they even provide them so I can add them in future releases).
I like how setuptools for Python manages these command line tools as "hooks". You define in a configuration file the name of the command and the entry function to execute, like:

    entry_points={
        'console_scripts': [
            'foo = foo.tools.cmd:foonction',
        ],
    }
Many Python packages provide such tools, and at least in our collaboration we use them on a daily basis...
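For readers unfamiliar with the mechanism, the function named after the colon (`foonction` in module `foo.tools.cmd` above) is an ordinary Python callable; the wrapper script that setuptools generates calls it and passes its return value to `sys.exit`. A minimal sketch of what such a function might look like follows. The arguments and behaviour are made up for illustration and are not taken from any real package.

```python
# Hypothetical contents of foo/tools/cmd.py
import argparse


def foonction(argv=None):
    """Entry point for the `foo` command; returns a process exit code."""
    parser = argparse.ArgumentParser(prog="foo", description="Echo the given inputs.")
    # Illustrative arguments only; a real tool would define its own.
    parser.add_argument("inputs", nargs="*", help="values to echo")
    parser.add_argument("--upper", action="store_true", help="upper-case the output")
    # argv=None makes argparse read sys.argv[1:], which is what happens
    # when the installed `foo` command is run from the terminal.
    args = parser.parse_args(argv)
    for item in args.inputs:
        print(item.upper() if args.upper else item)
    return 0
```

Accepting an optional `argv` keeps the function callable from other code and from tests, so users can still treat the package as a library.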
Update: terminology I ended up using in the Pkg3 documentation:
https://julialang.org/Pkg3.jl/latest/index.html#Glossary-1
Quoting the relevant parts:
Project: a source tree with a standard layout, including a `src` directory for the main body of Julia code, a `test` directory for testing the project, `docs` for documentation files, and optionally a `build` directory for a build script and its outputs. A project will typically also have a project file and may optionally have a manifest file:
Package: a project which provides reusable functionality that can be used by other Julia projects via `import X` or `using X`. A package should have a project file with a `uuid` entry giving its package UUID. This UUID is used to identify the package in projects that depend on it.
Application: a project which provides standalone functionality not intended to be reused by other Julia projects. For example a web application or a command-line utility. An application may have a UUID but does not need one. An application may also provide global configuration options for packages it depends on. Packages, on the other hand, may not provide global configuration since that could conflict with the configuration of the main application.
Projects vs. Packages vs. Applications:
- Project is an umbrella term: packages and applications are kinds of projects.
- Packages should have UUIDs; applications can have a UUID but don't need one.
- Applications can provide global configuration, whereas packages cannot.
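The three rules above can be restated as a small consistency check. This is only a sketch in Python, not how Pkg3 represents projects; the `Project` class, its field names and the `check` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Project:
    """Umbrella record: a project is either a package or an application."""
    name: str
    kind: str  # "package" or "application"
    uuid: Optional[str] = None
    provides_global_config: bool = False


def check(project: Project) -> List[str]:
    """Return a list of rule violations; an empty list means the project is consistent."""
    problems = []
    if project.kind not in ("package", "application"):
        problems.append(f"unknown project kind: {project.kind}")
    if project.kind == "package":
        if project.uuid is None:
            problems.append("a package should have a UUID")
        if project.provides_global_config:
            problems.append("a package may not provide global configuration")
    # Applications may omit the UUID and may provide global configuration,
    # so there is nothing further to check for them.
    return problems
```

For example, an application without a UUID passes, while a package that lacks a UUID and ships global configuration is flagged twice.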
I've found this terminology to be intuitive and helpful. Having "project" as an umbrella term for both packages and applications is good, since otherwise you find yourself using the overly long and awkward phrase "packages or applications" over and over in situations where the term "project" is quite natural and easy.