I see several reasons why this setup is often used and works reasonably well in Julia:

- Julia code bases tend to be smaller than in many other languages, especially compared to C++.
- Methods are open for extension and compilation is deferred until they are called. Thus, there are few hard dependencies – mostly, types need to be defined before methods can dispatch on them.
- It might also be a historical relic from Common Lisp.
Except for things that introduce compile-time instances, i.e., types and macros, you can actually refer to names before they are defined. In particular, if your files only contain methods, they can be included in any order without issues. Thus, it’s often rather easy to arrange the inclusion order of files: first macros and types, and finally methods (in any order). Notably, Julia does not force you to define methods in the same file as types – yet, in smaller code bases I tend to keep types and the methods on those types in a single file.
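To illustrate the deferred-compilation point: a method can freely call functions that are only defined later, as long as everything exists by the time the method is first *called*. A tiny sketch (function names are made up):

```julia
module OrderDemo

# The caller is defined before its callees – fine in Julia, because the
# names are only looked up when the method is actually called.
pipeline(x) = shout(double(x))

double(x) = 2x
shout(x) = string(x, "!")

end # module

OrderDemo.pipeline(3)  # "6!"
```

A type used in a dispatch signature, by contrast, must already exist when the method *definition* is evaluated – hence "types before methods" in the include order.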
To me it looks like the Julia design encourages one developer per package (which becomes a monolith with tight coupling); a team developing in parallel would be working in a package ecosystem (perhaps with a local registry).
This may have a bit of overhead but has the advantage of relying on (and benefiting from) the package manager for dependency handling, version checking, etc., and may allow for better scaling than traditional options (intra-package imports and the like). The downside from a single-developer perspective is that following local dependencies is a bit of a mess.
As the only Julia dev in my team (I’m the one doing the mathy stuff), one common issue is propagating updates “up” across dependencies.
In such cases, I typically end up having a sandbox project where I “dev” all relevant packages. This adds a lot of flexibility for trial and error in the REPL. But it can also quickly generate a mess if you want to register new versions of each package: it has to be done in a clean and ordered way (you have to “free” each package in order, resolve, and regularly test that everything is OK).
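For reference, that sandbox workflow can be sketched with the Pkg API. `PkgA` and `PkgB` below are hypothetical stand-ins generated on the fly, not real packages:

```julia
using Pkg

tmp = mktempdir()
Pkg.activate(joinpath(tmp, "sandbox"))    # throwaway sandbox environment

# Stand-ins for the real packages under development (hypothetical names).
Pkg.generate(joinpath(tmp, "PkgA"))
Pkg.generate(joinpath(tmp, "PkgB"))

# "dev" both, so local edits are picked up without registering anything.
Pkg.develop(path = joinpath(tmp, "PkgA"))
Pkg.develop(path = joinpath(tmp, "PkgB"))

# When registering new versions, unwind in dependency order:
# Pkg.free("PkgA"); Pkg.resolve(); Pkg.test()
```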
I don’t pretend this is the optimal workflow (especially with regard to the “team” structure); maybe I went too far with the “split package” approach, or maybe this is the sign of some tight coupling (not in terms of definitions, but in terms of usage). In any case, I wish there were a simpler way.
Conversely, I like the approach used in Go, with packages defined separately inside a module that can be imported separately (mind that the meanings of “module” and “package” are swapped compared to Julia).
I can’t find it now, but Matthijs Cox at ASML just published a blog post on their internal workflow (as I understand it, they have hundreds of Julia users internally and, since the demise of Invenia, are now probably the biggest production Julia users), and it was based around a local registry.
> Tribe was slashing its internal valuation of Canadian-British startup Invenia, on which it had bet $30 million, by 95%. Invenia co-founder and Chief Executive Officer Matthew Hudson had been “terminated” and a board-led investigation found he’d “secretly, systemically and repeatedly inflated the revenue and profitability of the company,” according to the memo, which was sent by Invenia board member and Tribe CEO Arjun Sethi.
I see what you mean. The approach you took seems to me a sound one, despite the drawbacks, which I believe are present everywhere. Furthermore, this reminds me of – if I remember correctly – The Cathedral and the Bazaar by Eric Raymond, where he posits that a way for devs to stay engaged and happy is to be owners of their software (by owners I mean sole developers with some autonomy and, ideally, copyright). In that case it was more like moving the inter-module communication from internal APIs (the case for big corporations and software systems) to external/public APIs relying on some OS-provided protocol, e.g., file I/O on UNIX. I wonder if this may be a barrier to the development of very large codebases with internal APIs, as @MacKa 's observations seem to point to.
As an aside, this piqued my interest and I just finished reading it. What a great essay! The Julia project is clearly set up as a Bazaar, and I think that comes with some surprises for certain users. Areas without polish are really just problems waiting to be solved by whoever is able.
Sorry, late for the discussion, but let me give my two cents:
The best option, I think, is a mix of options 2 and 4 – neither of which I see someone considering by itself.
A submodule should only be included where it appears in the hierarchy for an external user. If a package Graphs.jl has `Graphs.Algorithms`, then Algorithms.jl should be included inside Graphs.jl; if it is `Graphs.Utilities.Algorithms`, then Algorithms.jl should be included inside Utilities.jl, which is included inside Graphs.jl. In other words, `include` should only be used to build the unique hierarchy of the modules in the project; it is not a tool to load code that will just be called. What, however, if `Graphs.Experimental.Search` needs to use some methods from `Graphs.Utilities.Algorithms`? Simple, in Search.jl you do:
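Presumably the intended snippet is a relative import: each extra leading dot walks one level up the module tree, so from `Search` three dots reach `Graphs`. A self-contained sketch with a stand-in `shortest` function (the module layout mirrors the example above):

```julia
module Graphs

module Utilities
    module Algorithms
        shortest(x) = x  # stand-in for a real algorithm
    end
end

module Experimental
    module Search
        # ".": Search, "..": Experimental, "...": Graphs
        using ...Utilities.Algorithms
        find(x) = Algorithms.shortest(x)
    end
end

end # module Graphs

Graphs.Experimental.Search.find(42)  # 42
```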
One common pain point I find with nested modules is that you need to restate all `using` and `import` dependencies each time. For my use cases it is pretty obvious that a nested module should have at least the same dependencies and namespace visibility as its parent. I see how this could create much noise in deeply nested cases, but in my opinion, if you don’t want this, you probably don’t even need nested modules. (Happy to hear counter use-cases.)
Such a feature could probably be implemented with a macro, but unfortunately macros don’t play well with the LSP, as was already mentioned above.
You are right, @reexport can already do this.
A small example for reference, since I already tested it:
```julia
@reexport using Graphs
s3() = SimpleGraph(3)
s5() = SimpleGraph(5)
# using Graphs # --> this line is not needed anymore
s1s3s5() = [SimpleGraph(1), s3(), A.s5()]
```
On second thought, @reexport will nicely propagate the dependencies in inner modules, but it will also pollute the user’s session in case the user does a `using A`. So I wouldn’t prefer it for more complex (and realistic) scenarios. I think the only solution is to write the boilerplate code (e.g., the `using Graphs` etc.).
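That pollution is easy to demonstrate without Reexport.jl or Graphs.jl: the inner module below re-exports a name by hand (`using` plus `export`, which is essentially what `@reexport` generates), and the name then leaks into any session that does `using` on the outer module. All names here are made up:

```julia
module Outer

module FakeGraphs        # stand-in for Graphs
    export simple_graph
    simple_graph(n) = n  # stand-in for SimpleGraph
end

using .FakeGraphs
export simple_graph      # hand-written equivalent of @reexport

end # module Outer

using .Outer
simple_graph(3)  # now visible in the user's session too: the "pollution"
```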
The question is why nested modules start with zero dependencies (e.g. no Graphs) and don’t inherit their parent’s. Or, to be more dramatic: what features do nested modules actually bring to the table that are attractive to developers (other than logically categorizing structs/functions, which is mostly user-focused)?
I think such behavior could create dependency problems (i.e., dependency cycles) and would be most unwelcome. The rule is namespace isolation for safety; therefore, the developer is required to be explicit and intentional.
For example, in clean-architecture patterns (i.e., where functional dependencies run only towards the higher-level modules), one should only import the interfaces (in Julia, for example, functions to be overloaded) that the parent module exposes as such (although this is not enforced in Julia; at the package level it happens with all the modules ending in <SomethingSomething>Base). This makes it easier to work independently on larger codebases, as only changes to the interface will break lower-level functionality, and changes to lower-level code will not affect higher-level functionality. More on this stuff here. I am not sure if this is the case – but the Pkg dependency resolver should handle this sort of thing at a higher level; I remember some efforts years ago towards this – probably long solved by now.
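A sketch of that interface-only import style (all names hypothetical): the parent module declares an empty interface function, and the lower-level module imports just that function in order to overload it.

```julia
module App

function render end   # the interface the parent exposes

module TextUI
    import ..render   # import only the interface, to add methods to it
    struct Plain end
    render(::Plain) = "plain text"
end

end # module App

App.render(App.TextUI.Plain())  # "plain text"
```

Changes inside `TextUI` then cannot break `App`, as long as the `render` interface itself stays put.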