[ANN] PatModules.jl: a better module system for Julia

You tend to write a lot of long import lists, which is what this package gives sugar for:

Of course it’s just sugar, and in this case it doesn’t really reduce the writing all that much, but that’s one of the main packages that use submodules that comes to mind.

I see. Well, this is one of the things that I believe improve the access to the logic (readability of the code). The top module is the place where the reader can find where each function and type comes from.
It is also the only place where exports occur. Again, improving legibility.

4 Likes

By the way, one of the reasons that Julia doesn’t use namespaces as aggressively as Python is because we group related functionality into generic functions. Instead of having List.map, String.map, Tuple.map, etc, we just have one generic function map. The emphasis is on overloading generic functions rather than putting slightly different versions of functions in separate modules. In order to fully take advantage of multiple dispatch and function overloading, you want a pretty flat namespace.
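A minimal sketch of that point, using only `map` from Base. The `Pair2` type is hypothetical and exists only to show a user type extending the same generic function instead of getting its own `Pair2.map`:

```julia
# One generic function covers several container types.
doubled_vec = map(x -> 2x, [1, 2, 3])   # Vector
doubled_tup = map(x -> 2x, (1, 2, 3))   # Tuple
upper_str   = map(uppercase, "abc")     # String, mapped character by character

# Hypothetical user type, for illustration only.
struct Pair2{T}
    a::T
    b::T
end

# Add a method to the existing generic function rather than a new namespace.
Base.map(f, p::Pair2) = Pair2(f(p.a), f(p.b))

mapped = map(x -> x + 1, Pair2(1, 2))
```

Callers keep writing `map(...)` whatever the argument type is, which is why a flat namespace works well with dispatch.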

I think one takeaway from this thread is that sometimes it’s good to solicit feedback from the community before embarking on a new project, rather than after. It’s not uncommon for new Julia users to come to this forum and say, “You’re doing it wrong.” Unsurprisingly, those posts usually receive a bit of pushback. :stuck_out_tongue:

(The List.map, String.map, Tuple.map example is taken from languages like Erlang and Elm.)

8 Likes

Haha, to be clear, none of what I’m saying is meant to be a “you’re doing it wrong”. I expected a fair amount of pushback when I published this actually, as I can see that the Julia community is nothing if not strongly opinionated.

One thing I particularly like about Julia is the ability to construct these sorts of import systems if necessary. So you can code Julia your way and I can do it mine and we can both be happy. (Admittedly the metaprogramming involved is a bit of a barrier though.)

7 Likes

Haha, well the title “a better module system for Julia” implicitly means “You’re doing it wrong.” :wink:

5 Likes

You may be interested to know about this technique to find this out using the REPL

Though I admit that having to load a package to ask where something is defined is a bit of a twist, and if you are just reading the code it is kind of annoying.
It’s a fair complaint.
(Though I personally prefer it to the trade-off of listing all imports. But I understand others disagree.)
Just thought I would bring up this “trick” of using the REPL to ask where something is defined, even if you don’t know the module, as I think it is underappreciated.
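The snippet that originally accompanied this post is not preserved here. As a rough sketch of the kind of query meant, assuming the standard `parentmodule` function and the `@which` macro from InteractiveUtils (loaded automatically in the REPL):

```julia
using InteractiveUtils  # provides @which outside the REPL

# Module in which the generic function `map` was defined.
mod_of_map = parentmodule(map)

# The method a specific call would dispatch to, with its defining module.
m = @which map(sqrt, [1.0, 4.0])
mod_of_method = m.module
```

The returned `Method` also carries `m.file` and `m.line`, which is often enough to jump to the definition.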

1 Like

I’m a bit skeptical of this point–including a file multiple times in Julia is nearly always a mistake. While I completely support the notion of helping users avoid making mistakes, I don’t think this situation is anything like C or C++, where including a file multiple times is normal and required.

8 Likes

Don’t know if anyone has mentioned it though, but PatModules.jl as a name, 10/10 rolls off the tongue. I wouldn’t change that.

Indeed, I think dispatch and functional styles really change the way that code is written. Explicit function names for functions that are used everywhere means I just generally write code assuming some idea like max is extended to what I’m working on. I don’t really check what exists but just use the functions that do :man_shrugging: Hard to explain but it’s like the Plots alias system.

5 Likes

Fair comment!

I don’t think I agree. For example I’m writing some code for a variety of neural network models. In common.jl I want to factor out MLP, and then use that as a building block in both neural_ode.jl and neural_sde.jl. The solution to this so far has been to rely on both neural_ode.jl and neural_sde.jl being included in just one place, and have that place include common.jl on their behalf… which is the kind of lack-of-dependency-tracking that I’m not a fan of.
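To make that arrangement concrete, here is a minimal sketch. The file contents are held in strings and evaluated with `include_string` so the snippet runs on its own; `Models`, `make_ode` and `make_sde` are hypothetical names, and `MLP` is reduced to a single field:

```julia
# Stand-ins for the three files; in a real project these are separate .jl files.
common_jl     = "struct MLP; width::Int; end"
neural_ode_jl = "make_ode() = MLP(32)"
neural_sde_jl = "make_sde() = MLP(64)"

# The single place that includes everything (hypothetical top-level module).
Models = Module(:Models)
for src in (common_jl, neural_ode_jl, neural_sde_jl)
    include_string(Models, src)   # plays the role of include("...")
end

ode = Models.make_ode()
sde = Models.make_sde()
shared_type = typeof(ode) === typeof(sde)
```

Nothing in `neural_ode_jl` or `neural_sde_jl` records that they need `MLP`; they work only because the parent evaluated `common_jl` on their behalf.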

Haha, thank you!

I think it’s worth unpacking the nuance behind that question. Namely, why #include is such a pain in C/C++, why Julia seems to have includes despite being a newer language, and whether the aforementioned pain points apply.

From a modules/namespacing perspective, the C compilation model has 4 main quirks:

  1. There is only one global namespace. C++ adds lexical namespacing, and as you can imagine that helps immensely.
  2. Declaration and implementation are split between header and c/cpp files. As I’ll address in a second, this is a minefield and practically no modern language retains this approach.
  3. Includes are not syntax aware and can paste code anywhere. Think using an include to add the signature for a function!
  4. The default object-based linking model necessitates inclusion of the same header in multiple locations. This dramatically increases the chance of multiple inclusions per object and thus header guards.

Now to Julia. Julia’s module system is, as far as I can tell, a near copy of Ruby’s. In contrast to the C/C++ system:

  1. Namespaces are pervasive and not purely lexical (modules are first class “objects”).
  2. Declaration is almost always implementation and thus source files are included directly.
  3. include is syntax-aware and must only introduce fully-formed constructs (types, variables and functions for Julia). No literal copying and pasting of partial functions.
  4. Most importantly, code is extracted into external files and then included only as a space-saving measure. This is in contrast to something like PHP where common includes were/are used all the time. Conceptually, this means that reversing all includes and inlining everything into one .jl file is semantically equivalent. It also means that duplicate definitions are essentially non-existent unless someone purposefully writes them out multiple times.
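The equivalence in point 4 can be sketched as follows. This is an illustration under assumptions: `shapes.jl`, `Split`, `Inlined` and `area` are hypothetical names, and the file is written to a temporary directory so the snippet is self-contained:

```julia
# Write a one-line source file into a temporary directory.
path = joinpath(mktempdir(), "shapes.jl")
write(path, "area(r) = pi * r^2\n")

# Variant 1: the definition lives in a separate file and is included.
Split = Module(:Split)
Base.include(Split, path)

# Variant 2: the same definition is written in place.
Inlined = Module(:Inlined)
Core.eval(Inlined, :(area(r) = pi * r^2))

# Both modules end up with an equivalent `area` function.
same_result = Split.area(2.0) == Inlined.area(2.0)
```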

Conceptually, you can think of this as a “unity build” in C/C++ projects, where source files are included directly and only limited headers are required for external dependencies. Incidentally, unity builds do not suffer from many of the #include-related pitfalls that normal C/C++ projects do, but face cultural aversion and a lack of tooling support. Neither of these concerns is present in Julia.

One point I do agree upon is tooling and discoverability. Here though, I’m not sure include is primarily to blame, but using. As an example, compare browsing through C# and Java projects on GitHub. The former only has using and thus makes finding what comes from where difficult, whereas the latter primarily uses import.

However, I’m going to pin this on the tooling and not the language. For example, looking through Python projects was a pain before the new go-to-definition functionality because of varying PYTHONPATHs. GitHub’s search functionality is also a dumpster fire on the best of days.

Thankfully you don’t have to. JuliaHub is an invaluable resource (I would say essential infrastructure at this point) even when using an IDE. For local code, there is an LSP plugin for pretty much every mainstream text editor out there.

Philosophical tangent/rant: Despite being a big fan of Vim, I believe the “UNIX is my IDE and text is all you need” crowd has set us back at least 20 years in PLT and PL tooling design. It’s nice to see a) C losing mindshare and b) fewer languages catering to that crowd.

PS: Neural CDEs are great :slight_smile:

12 Likes

I think people have been mixing up the concept of a file and a module - they are two different things:

  • when one calls include, that means include the file; it should allow including the same file multiple times, because this is what include means
  • when one loads a module, one should be able to load the module multiple times without creating multiple definitions

The concept of module/namespace has been mixed up by people in many different languages, and for languages like Python, it is mixed up intentionally by the designers. However, they are not exactly the same thing.

I 100% agree that a better practice is not to use include at all. By definition it requires the programmer to manage files and code dependencies by hand, which can very easily go wrong, instead of leaving that to the compiler. For example, the order of includes needs to be carefully handled, which creates an extra burden that shouldn’t be there.

Splitting code up and wrapping it into a package is a workaround, not a solution, and it is only needed because this feature is not implemented yet.

Quite a few issues have been addressed in issue 4600; this feature should be about importing, not including. I feel we are repeating in this thread what has already been discussed around 4600. Maybe people should first read the discussion there, along with the other issues connected to it.

1 Like

The splitting of Pkg into modules is very annoying and something I should really undo at some point. They all just import each other and it adds no useful structure.

7 Likes

Interesting… I found Pkg source reasonably understandable to navigate, and the distinction between e.g. API.test() and Operations.test() makes sense.

The API module does make sense, but the separate modules for Types, Operations, etc are just a mess. Every file has a huge stack of imports from sibling modules at the top that aren’t necessary at all because it’s not like there are name collisions. And adding new functions or types is unnecessarily annoying because the pointless module structure makes it hard to figure out a good place to put things.

9 Likes

I think that this is a misunderstanding: modules usually include their own files, while all other code usually uses the code loading mechanism (ultimately using / import with project files).

7 Likes

I shan’t try and address all the points above, but to head off this one - there’s no misunderstanding. What you are saying is precisely the problem: include is not a good way to organise code, as it (a) enforces that the files of your module be a tree rather than a DAG; (b) demands that the parents in this tree include things on their (sub-sub-…) children’s behalf.

5 Likes

I am not sure what you mean here, since directed trees are DAGs.

If this is a problem for you in practice, that’s usually a good sign that you should organize your code into modules, and let the loader build up the tree/DAG. This works fine.

Generally, I am afraid that you were a bit hasty to conclude that

since this is not a problem for Julia programmers in practice.

3 Likes

Correct. But not all DAGs are directed trees.

The loader is incapable of building a general DAG. This will result in duplicate definitions.
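A small sketch of what duplicate definitions look like when the same file is pulled into two places. Here `include_string` stands in for `include("common.jl")`, and the module names are hypothetical:

```julia
# Stand-in for the contents of common.jl.
common_jl = "struct MLP; width::Int; end"

# Two sibling modules that each include the shared file themselves.
NeuralODE = Module(:NeuralODE)
NeuralSDE = Module(:NeuralSDE)
include_string(NeuralODE, common_jl)
include_string(NeuralSDE, common_jl)

# Each module now owns its own, unrelated MLP type.
distinct_types = NeuralODE.MLP !== NeuralSDE.MLP

# A value built in one module is not an instance of the other module's type.
layer = NeuralODE.MLP(8)
compatible = layer isa NeuralSDE.MLP
```

Any method written against `NeuralSDE.MLP` will not accept `layer`, even though the two types come from the same source text.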

I stand by this statement. But I can certainly tell that that’s not the prevailing sentiment here. Something which astonishes me, frankly. This isn’t even a debate in any other community.

I’ve seen the code Julia programmers write in practice. (In the various major packages.) Respectfully, it is not of the quality I would expect from a modern language that claims to have things worked out.

3 Likes

I think you mean to say that it’s not organized in a way you find optimal. Surely you understand that saying the code in all the major packages you’ve looked at is low quality is not, in fact, respectful.

8 Likes

If I do using Tables, and then using DataFrames, the latter of which also does using Tables, there’s no duplication of definitions, is there?
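A quick way to check this, with the standard-library package Dates standing in for Tables so that nothing needs installing (the module names `A` and `B` are hypothetical):

```julia
# Two independent modules that each load the same package.
module A
    using Dates
end

module B
    using Dates
end

# Both refer to one and the same module object, and hence the same types.
same_module = A.Dates === B.Dates
same_type   = A.Date === B.Date
```

The package is loaded once and every later `using` binds to that single instance.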

You might take this as an opportunity to evaluate some of your assumptions. Given your initial statement that you love everything about Julia except for this, I take it you recognize the care and thoughtfulness with which the language was designed. It is certainly possible that we all have blinders on and this really is a wart that needs addressing (if so, kudos for trying to address it!). But might it also be possible that there’s something you’re overlooking?

I have had countless experiences with this language running into something that seemed like a mistake, or was unintuitive, but in almost every instance, once I spent some time reading up on the subject, asking questions here or on Slack, and learning why things were the way they are, I came to appreciate the design.

I’m struck by the fact that you created your Discourse account a week ago and don’t have any other posts asking about how people organize their code, how to avoid duplicate definitions, etc. I’m not everywhere on Slack and Zulip, but I don’t think I’ve ever seen you post there (we have #gripes channels, I bet you would have received good feedback there). It seems as if you looked around, judged a bunch of code you saw as low quality, then jumped on here announcing your “better way.” If nothing else, I think it’s a bit naive to assume there wouldn’t be pushback.

I think it’s great that you saw a need and tried to fill it. That’s the kind of attitude we want in this community. What I don’t think we need is someone that sees only one right way to do things, fails to solicit or listen to feedback, and insults people that work in a different way.

16 Likes