I would argue if it’s a big enough application, then it is. For example, whatever you would put in separate modules might make sense to be separate packages. Especially any kind of separate components of the single application. In particular, I don’t think it’s important that a package has more than one “user” (e.g. a downstream module/package).
Note that the difference between packages and modules is pretty small. In particular, separate packages do not have to exist in different git repositories, they do not have to be hosted online, and they do not have to be registered in General. The difference is that they aren’t nested (but you can have submodules within a package, and you can extend the functions of one package via methods in another) and need separate Project.toml’s, and if you want to make use of versioning and compat, you need to register them (in your own registry). I think one way to see packages is as independently versioned modules for which the stdlib Pkg provides mechanisms to express dependency relationships and compatibility constraints. In my opinion, they provide a great way to have a large, scalable codebase (by which I mean able to scale to many developers, many components, or a lot of code, not just to many users).
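To make the point about registration concrete, here is a minimal sketch of a package that exists only as a local directory. The package name LocalSolvers, its function residual_norm, and the temporary directory are made up for illustration. Adding the parent directory to LOAD_PATH is just one way to make it loadable; in day-to-day work one would more commonly use Pkg.develop(path=...) from the environment that depends on it.

```julia
using UUIDs  # standard library, used only to generate a fresh package UUID

# Hypothetical package name and location, chosen for this sketch.
root = mktempdir()
pkgdir = joinpath(root, "LocalSolvers")
mkpath(joinpath(pkgdir, "src"))

# A package is a directory with a Project.toml and src/<Name>.jl.
write(joinpath(pkgdir, "Project.toml"), """
name = "LocalSolvers"
uuid = "$(uuid4())"
version = "0.1.0"
""")

write(joinpath(pkgdir, "src", "LocalSolvers.jl"), """
module LocalSolvers
export residual_norm
# Euclidean norm of the residual b - A*x, written without LinearAlgebra.
residual_norm(A, x, b) = sqrt(sum(abs2, b - A * x))
end
""")

# No git repository, no hosting, no registry: make the parent directory
# visible to the package loader and load the package by name.
push!(LOAD_PATH, root)
using LocalSolvers

r = residual_norm([2.0 0.0; 0.0 2.0], [1.0, 1.0], [2.0, 5.0])
println(r)
```

Versioning and compat bounds between such packages are what would still require a registry, as noted above.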
Indeed, I second the above. Effectively, I have one complicated package to compute something in my field, and recently I have split one part of it into another very small package that does a part of the necessary computations of the big package. The new one has already attracted more interest than the original big package.
Related to this, what I would like to see is a more permissive way to register package names in the General registry, specifically allowing names that carry the GitHub username along with them (such as myuser/MyPackage). That would encourage splitting packages into smaller units without one feeling that one is “taking” a name that could better be used by someone else.
For example, I want to have a Gram-Schmidt orthogonalization implemented for a small use case, but I don’t want to carry all the dependencies of serious implementations of that (e.g. BLAS). I would like to register myuser/GramSchmidt and have it as a dependency of my packages, but I would not dare to register a package with that name.
If the package looks like it could be of interest to someone other than you, it is fine to register it, even when very small. If it is of interest only to you, no need to register it at all!
Quoting myself from this response in the above linked thread regarding PatModules.jl:
One of the reasons that Julia doesn’t use namespaces as aggressively as Python is because we group related functionality into generic functions. Instead of having List.map , String.map , Tuple.map , etc, we just have one generic function map . The emphasis is on overloading generic functions rather than putting slightly different versions of functions in separate modules. In order to fully take advantage of multiple dispatch and function overloading, you want a pretty flat namespace.
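As a small illustration of that point, the sketch below adds a method to the existing generic function map for a new container type instead of defining a separate namespaced map. The type TwoVector and the helper increment are hypothetical names used only here.

```julia
# Hypothetical container type, introduced only for this example.
struct TwoVector{T}
    a::T
    b::T
end

# Add a method to the existing generic function `map` instead of
# defining a separate `TwoVector.map` in its own namespace.
Base.map(f, v::TwoVector) = TwoVector(f(v.a), f(v.b))

increment(x) = x + 1

from_array  = map(increment, [1, 2, 3])        # array method from Base
from_tuple  = map(increment, (1, 2, 3))        # tuple method from Base
from_string = map(uppercase, "abc")            # string method from Base
from_custom = map(increment, TwoVector(1, 2))  # the method defined above
```

All four calls go through the same name, and dispatch on the argument type picks the implementation.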
(The List.map , String.map , Tuple.map example is taken from languages like Erlang and Elm.)
I never tried that. If I have a package registered in the general registry, can I make it depend on a package that is not in the general registry?
Other users of the registered package will have the non-registered package installed automatically when the registered package is installed?
How will versions be handled?
(Now that brings me another question: what happens if someone deletes a GitHub repository of a registered package? Do the registered versions continue to be available?)
PS. I am not considering the possibility of having a custom registry and telling the users of my package to add it. I don’t think this is a viable alternative for most users; custom registries soon become too messy for the end user.
They won’t be accessible by git, but the Julia package servers should store any versions they’ve seen forever, so hopefully everything will still work.
Here’s a concrete example of a problem which roughly resonates (at least in my ears) with the gripes of @panos.asproulis:
I am currently working on a multigrid solver which internally uses the conjugate residual method (an alternative implementation of MinRes) as a smoother. In an ideal world, I would like to split this code into three modules, namely one each for the multigrid and conjugate residual codes, and an additional AbstractIterativeSolvers module which introduces the abstract interface shared by the multigrid and conjugate residual solvers. Obviously, the dependencies between these modules are as follows:
AbstractIterativeSolvers depends on nothing.
ConjugateResiduals depends on AbstractIterativeSolvers.
Multigrid depends on ConjugateResiduals and AbstractIterativeSolvers.
The problem then is that as I am developing these modules, I want to be able to load each of these modules and all their dependencies, but also only their dependencies. So for example, when I am testing ConjugateResiduals, I want to be able to load only ConjugateResiduals and AbstractIterativeSolvers and not Multigrid, because Multigrid might incur a parsing error due to some unfinished edits (research software development is messy…). The current module system does not allow me to express this relationship cleanly: if I put an include("AbstractIterativeSolvers.jl") in both the ConjugateResiduals.jl and Multigrid.jl files, then the two modules end up using two different copies of AbstractIterativeSolvers, which defeats the point of introducing an abstract interface. On the other hand, if I don’t put these includes, then I must be careful to always reload exactly the right files depending on the changes I made.
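The duplicate-copy effect can be reproduced in a few lines. This is a minimal sketch with a stand-in AbstractIterativeSolvers.jl written to a temporary directory; the type name AbstractSolver is an assumption made for the example. Calling Base.include(SomeModule, path) is used here as a stand-in for writing include(path) inside that module's body.

```julia
# Write a tiny stand-in for AbstractIterativeSolvers.jl.
dir = mktempdir()
path = joinpath(dir, "AbstractIterativeSolvers.jl")
write(path, """
module AbstractIterativeSolvers
abstract type AbstractSolver end
end
""")

# Two otherwise empty modules standing in for the real solver modules.
module ConjugateResiduals
end
module Multigrid
end

# Each module includes the same file, so each gets its own nested copy.
Base.include(ConjugateResiduals, path)
Base.include(Multigrid, path)

CRAbstract = ConjugateResiduals.AbstractIterativeSolvers.AbstractSolver
MGAbstract = Multigrid.AbstractIterativeSolvers.AbstractSolver

# The two abstract types share a name but are unrelated types.
same_type = CRAbstract === MGAbstract
println(same_type)
```

A solver declared as a subtype of one copy is therefore not a subtype of the other, which is why the shared interface stops being shared.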
Clearly, none of this is a problem if you are happy to either provide only a single, monolithic Multigrid module which takes care of loading all the dependencies in the right order, or to split AbstractIterativeSolvers and ConjugateResiduals into separate packages. However, both approaches involve a significant amount of overhead and complexity if all you want is to prototype your ideas on a timescale which is somewhere in between “I can hack this together in five minutes” and “I want this piece of code to last forever and I am willing to do whatever it takes to achieve that”.
Yes, that is of course the solution that Julia forces upon me. The point is that this is not a particularly great solution because it raises the question of where to place that single include().
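For reference, the single-include arrangement under discussion looks roughly like the sketch below, inlined into one block so that each nested module stands in for a file included exactly once from the entry file. The wrapper name MultigridProject, the solver types, and the solve function are hypothetical.

```julia
module MultigridProject

# Stand-in for include("AbstractIterativeSolvers.jl"), done exactly once.
module AbstractIterativeSolvers
    abstract type AbstractSolver end
    function solve end  # generic function that the solvers add methods to
end

# Stand-in for include("ConjugateResiduals.jl").
module ConjugateResiduals
    using ..AbstractIterativeSolvers: AbstractSolver
    import ..AbstractIterativeSolvers: solve
    struct CRSolver <: AbstractSolver end
    solve(::CRSolver) = "conjugate residual"
end

# Stand-in for include("Multigrid.jl").
module Multigrid
    using ..AbstractIterativeSolvers: AbstractSolver
    import ..AbstractIterativeSolvers: solve
    using ..ConjugateResiduals: CRSolver
    struct MGSolver <: AbstractSolver
        smoother::CRSolver
    end
    solve(mg::MGSolver) = "multigrid with " * solve(mg.smoother) * " smoother"
end

end # module MultigridProject

mg = MultigridProject.Multigrid.MGSolver(MultigridProject.ConjugateResiduals.CRSolver())
description = MultigridProject.AbstractIterativeSolvers.solve(mg)
```

Both solvers now see the same AbstractSolver, but the dependency order lives only in the order of the blocks in the entry file, and loading ConjugateResiduals without Multigrid means editing that file.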
But why do I have to distinguish between development and production? These things always go in a cycle, so I will have to keep switching my includes around, which is a nuisance and a source of errors.
More abstractly speaking, the point is that code often has the structure of a directed acyclic graph, and the Julia module / include() system does not allow me to reflect that structure. I believe keeping that structure around would be useful throughout the life cycle of a piece of software, and putting all the includes in the “entry file” of a package does not do that.
Of course, this doesn’t mean that the current system is unusable. I’m just expressing my belief that there might be some room for improvement (though I have no idea what that improvement might look like).
I also think include is not a great way to organize code and one of Julia’s weak spots. I compared it to Go in another thread:
In Go, you can’t include files: you can import packages and one package = one directory and each file has an explicit list of imported packages at the top, and importing names in the main namespace (like using in Julia) is strongly discouraged. In Julia, files can include files can include files… Open a file, you don’t know in which context it is included so you can’t be sure how it’s going to be interpreted. Maybe it’s even included several times in different contexts? Maybe the behavior depends on the order of includes?
Another way to put it: declarative (rather than imperative) code is generally considered more robust or at least easier to analyze and verify. In Go, you write imperative code in functions, but the overall organization (definition of packages and functions) is declarative. In Julia, this organization itself is imperative.
^ I completely agree, and this is the key point IMO.
Open a file, you don’t know in which context it is included so you can’t be sure how it’s going to be interpreted. Maybe it’s even included several times in different contexts? Maybe the behavior depends on the order of includes?
And problems like this are the result.
It’s actually pretty straightforward to demonstrate genuine spooky-action-at-a-distance as a result of the typical include() pattern.
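One way such an effect can arise is sketched below. It is a constructed illustration, not necessarily the demonstration the post has in mind, and the file names a.jl and b.jl and the functions scale and scaled_ten are hypothetical. Two files are included into the same flat module, and the second one happens to reuse a helper name from the first.

```julia
dir = mktempdir()

# a.jl defines a helper and a function that relies on it.
write(joinpath(dir, "a.jl"), """
scale(x) = 2 * x
scaled_ten() = scale(10)
""")

# b.jl is an unrelated file that happens to reuse the helper name.
write(joinpath(dir, "b.jl"), """
scale(x) = 10 * x
""")

# One flat module, as in the typical pattern of a package entry file
# that includes every source file in turn.
module Demo
end

Base.include(Demo, joinpath(dir, "a.jl"))
before = Demo.scaled_ten()

# Including b.jl silently replaces the method that a.jl depends on.
Base.include(Demo, joinpath(dir, "b.jl"))
after = Demo.scaled_ten()

println((before, after))
```

Nothing in a.jl changed, yet its behavior did, and which definition wins depends only on the order of the includes.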
I’ll reiterate the mention of FromFile.jl from earlier in the thread – this package introduces a custom import system that fixes all these issues. Never write an include() again.