Restructuring code from modules to package

I have developed an application that makes use of 20-odd modules. Coming from Matlab, I have ensured that Julia finds my code by adding the relevant folders to the LOAD_PATH. The user uses my application by using the relevant modules. The dependencies between my modules follow a (relatively complex) directed acyclic graph.

The time has come to package the code to make it distributable. I have the right folder structure, with /src, /test, Project.toml and so on, and I “] dev” the project.

Trouble appears when I empty the LOAD_PATH: my primary module (which has the same name as the project) does not find the secondary modules (that are neatly stored in /src).

Whence my questions:

  1. Can a package make more than one module available for using (by both the user and other parts of the package)?
  2. Specifically, is there a way to make the package, on compilation, add the relevant folder to the LOAD_PATH? Edit: for the purpose of making the secondary modules available.
  3. Including modules as submodules in the primary module and exporting them would be a way (with small API changes, no problem) to make secondary modules available to the user. I have not tested this (lazy), but I am concerned that making a secondary module access another secondary module in this way (through the primary module) creates a cycle in the dependency graph (the primary module uses or includes ‘all’ secondary modules), so I cannot quite see that working.
  4. Am I persisting in unJulian ways, and must I think one Git repo == one package == one module, and boil things down to 3-4 packages with dependencies (some work for me to get my code there…)?

Thank you in advance!

I think some more info about the structure of the code is needed to make useful suggestions.

However, 20+ modules seems like a lot. I suspect that you would be better off combining most of them into a much smaller set of modules.

If you want multiple modules in one package, the usual approach would be submodules.
The main file would contain something like

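module MyPackage

include("sub1.jl")   # defines module Sub1
include("sub2.jl")   # defines module Sub2; it uses Sub1, so it must come second
using .Sub1, .Sub2   # the leading dot means "a submodule of this module"

end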

and in sub2.jl:

module Sub2

using ..Sub1

end

With 20 modules, this gets hard to follow pretty soon.
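A self-contained sketch of the same layout, with the two files inlined so that it runs as a single script. The function names `double` and `quadruple` are placeholders for illustration, not taken from anyone's actual code.

```julia
module MyPackage

# Stands in for include("sub1.jl")
module Sub1
export double
double(x) = 2x
end

# Stands in for include("sub2.jl")
module Sub2
using ..Sub1          # ".." is the enclosing module, so this resolves to MyPackage.Sub1
export quadruple
quadruple(x) = double(double(x))
end

using .Sub1, .Sub2    # "." is this module; brings the exported names into MyPackage

end # module MyPackage

println(MyPackage.Sub2.quadruple(3))
```

Sub2 reaches Sub1 through a relative path, so nothing here depends on the LOAD_PATH.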

Typically, either there is a subset of modules that are of independent use, and these make natural candidates for a package (with 1 module), or there isn’t a natural second package. In the latter case the question is: should you simply have one module? What would be the drawback of this?

Another structure is to generate a module that contains struct definitions and stubs for methods. Other modules then use this module and extend the methods. Whether that fits your use case is hard to say without knowing more about the structure of the code.
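A rough sketch of that "definitions and stubs" structure, in case it helps. All names here (`Elements`, `Bars`, `AbstractElement`, `stiffness`, `Bar`) are hypothetical and only illustrate the pattern.

```julia
module Elements                  # hypothetical module holding types and stubs
export AbstractElement, stiffness
abstract type AbstractElement end
function stiffness end           # stub: a generic function with no methods yet
end

module Bars                      # hypothetical module adding one concrete element
using ..Elements
import ..Elements: stiffness     # import so that we extend the stub instead of shadowing it
export Bar
struct Bar <: AbstractElement
    EA::Float64                  # axial stiffness
    L::Float64                   # length
end
stiffness(b::Bar) = b.EA / b.L
end

using .Elements, .Bars
println(stiffness(Bar(210.0, 2.0)))
```

Every module that adds an element depends only on `Elements`, never on its siblings, which keeps the dependency graph shallow.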

On your second question: once you made a package, you add it to your environment and that takes care of code loading (including all dependencies). So the LOAD_PATH is no longer something you need to worry about.

On the fourth question: there is now a way of having multiple packages in one repo, but I think that’s a separate issue from code organization.

Hi,

Well, my code structure… coming from Matlab, I started off with one module == one object, though I no longer insist on it.

I have edited my question #2: would the LOAD_PATH be a way of making the secondary modules available?

Separate question, but a pointer to the doc on having multiple packages in a repo would interest me very much.

But yes, I do see how “one package==one module” constrains… correction: gently guides towards a modular design.

:grinning:

Since you asked, here is the using graph (but I do not expect anyone to bother with this list).

Lithe is a “just add your element” FE code. My idea is that a new element family will be programmed - you know me - as a new module.

# Demo code for element developer, could thus be left out of the package 
Use(:SimpleTetragonElement,   [:Lithe, :Espy, :Dialect,:Elemental,  :Dots, :Materials])
Use(:SimplestTetragonElement, [:Lithe, :Espy, :Dialect,             :Dots])
Use(:VolumeElement,           [:Lithe, :Espy, :Dialect,:Elemental,  :Dots, :Materials])
Use(:VolumeMaterial,          [:Lithe, :Espy, :Dialect,             :Dots, :Adiff, :NewtonRaphson, :Materials])

# Core functionality, solve, display, export results
Use(:ElementTestBench, [:Lithe, :Dialect, :Espy])
Use(:LitheGraphics, [:Lithe, :Dialect])
Use(:GraphicMesh, [:Dialect])
Use(:Lithe, [:Dialect, :Adiff, :Espy])

# Metaprogramming operations on element code, Espy and Express could be merged.  
Use(:Espy, [:Dialect, :Express])   
Use(:Express, [:Dialect])             

# components the element developer might want to use
Use(:NewtonRaphson, [:Dialect, :Adiff, :Dots])
Use(:Elemental, [:Dialect])
Use(:Materials,               [])

# used everywhere
Use(:Adiff, [:Dialect])      # could be a package, my take on automatic differentiation
Use(:Dots, [:Dialect])     
Use(:Dialect, [])

# used when writing the input file (Julia script) of a FE analysis. 
Use(:MeshReader, [:Dialect]) # thin wrap around AbaqusReader
Use(:Unit, [:Dialect])     # could be a package
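
If the modules end up as submodules of one package, the include statements have to follow this graph, dependencies first. A minimal sketch of how such an order could be derived, using only a subset of the list above; the `deps` dictionary and the `load_order` function are illustrative helpers, not part of Lithe.

```julia
# Subset of the list above, written as module => modules it uses.
deps = Dict(
    :Lithe   => [:Dialect, :Adiff, :Espy],
    :Espy    => [:Dialect, :Express],
    :Express => [:Dialect],
    :Adiff   => [:Dialect],
    :Dialect => Symbol[],
)

# Depth-first topological sort; throws an error if it meets a cycle.
function load_order(deps)
    order = Symbol[]
    state = Dict{Symbol,Symbol}()          # :visiting or :done
    function visit(m)
        s = get(state, m, :new)
        s === :done && return
        s === :visiting && error("cyclic dependency through $m")
        state[m] = :visiting
        foreach(visit, get(deps, m, Symbol[]))
        state[m] = :done
        push!(order, m)                    # pushed only after all its dependencies
    end
    foreach(visit, sort!(collect(keys(deps))))
    return order
end

order = load_order(deps)
println(order)
```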

This is maybe a question, maybe an answer, but doesn’t something like

module Main
    include("./module1/Module1.jl")
    include("./module2/Module2.jl")
    using .Module1, .Module2
    ...
end

just work if Main is in src and module1 and module2 are subdirs of src?

But in Julia this is more of a headache. Modules are more useful for separating a set of structures and functions that together define a more or less closed set of functionalities.

You will probably find the answer to the second question here:

though I think it is unlikely that this will be a better alternative than simply flattening the package, unless your package is already really big.

Hi, thanks,

I was thinking in the same direction

module Main
    include("./module1/Module1.jl")
    include("./module2/Module2.jl")
    using .Module1, .Module2
    export Module1, Module2
    ...
end

so the user can do

using Main
Module2.foo()

which is a nice syntax.
But now: Module2 uses Module1.

module Module2 
using Module1
end

does not work (not on the path and not an installed module/package), and

module Module2
using Main
end

which I haven’t tried, smells of a cyclic dependency.
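
For what it is worth, a sketch of how the sibling can be reached with a relative path instead, which avoids going back through the top module. The top module is called `MainPkg` here (a made-up name, since `Main` is already Julia's own top-level module), and `foo` and `bar` are placeholders.

```julia
module MainPkg               # hypothetical name; `Main` itself is taken by Julia

module Module1
foo() = "foo from Module1"
end

module Module2
using ..Module1              # relative path to the sibling, not `using Module1`
bar() = "bar calls " * Module1.foo()
end

export Module1, Module2      # make the submodule names visible after `using MainPkg`

end # module MainPkg

using .MainPkg
println(Module2.bar())
```

Module2 depends on Module1 only, and MainPkg merely contains both, so the graph stays acyclic.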

And thank you for the link to the multiple packages in a repo!!! :grinning:

Definitely that is not a good idea.

If you think the user can use Module1 without Main, then make Module1 a package and add it as a dependency of Main.

Also, it is possible that you are overusing modules to avoid name conflicts. Julia does not need that, because of multiple dispatch. One should not think of a module as an “Object”.
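
A small illustration of that point, with made-up types: the same function name carries one method per type, all in a single namespace, so no module per “object” is needed.

```julia
# Two hypothetical types living side by side in one module (here, the script itself).
struct Circle
    r::Float64
end

struct Square
    a::Float64
end

# One generic function, one method per type; dispatch picks the right one.
area(c::Circle) = pi * c.r^2
area(s::Square) = s.a^2

println(area(Circle(1.0)))
println(area(Square(2.0)))
```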

I clearly am overusing modules, I now see :grin:

20-odd modules will become 4-odd packages, living, to begin with, in the same repo. I have something to look forward to over the weekend!!!

Unfortunately, I don’t fully understand your issue, but I believe I can point you to some useful resources.

I recently wrote a high-level overview of modules & packages in an attempt to alleviate confusion:
Could we make first-class support for packages that are "just files" and not repositories? - #20 by MA_Laforge

Somewhat similar (but less well explained in my opinion - read this last - if at all):
How to add a folder path in your .jl script - #8 by MA_Laforge

Creating your first package (incl. file organization)
How does the module system actually work? - #3 by MA_Laforge

Tips for include vs import/using
Proper way of organizing code into subpackages - #3 by MA_Laforge

Tips for directory structure/LOAD_PATH
Proper way of organizing code into subpackages - #4 by MA_Laforge

Link to more relevant threads:
Proper way of organizing code into subpackages - #5 by MA_Laforge

I think the key questions are:

  • What is the benefit of having more than 1 module in your case? Name conflicts? Easier to understand code logic?
  • Will users use the sub-modules independently? Or will they always use the main module?

If users use sub-modules independently, this looks like a case for a separate package.
Given your description of complex interdependencies of the modules, it sounds like having multiple modules complicates reasoning about the code. Then I would likely get rid of the sub-modules entirely.

I actually went through something quite similar to your experience when I transitioned from Matlab to Julia. I started with lots of modules (easier to reason about, I thought). Then I ended up with confusing imports all over the place. Finally, I figured out what part of the code was truly independent of the rest and factored that out into a couple of separate packages (which will never be used independently; it’s all one big model). But the packages help with reasoning about the code and also with editing and testing (testing is faster).

MA_Laforge, I will definitely read this, thank you!

I actually went through something quite similar to your experience when I transitioned from Matlab to Julia. I started with lots of modules (easier to reason about, I thought). Then I ended up with confusing imports all over the place.

This is a good summary of my own story!

Name conflicts were not, in general, what pushed me to create modules.

When cleaning up, I will keep some packages separate: they provide functionality that may be useful outside my current project. My project will reexport relevant functionality from these packages.
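
A sketch of what such re-exporting could look like. `Units` stands in for a separate reusable package (inlined as a module so the snippet runs on its own), and `Project`, `meter`, and `beam_length` are made-up names.

```julia
module Units                 # stands in for a separate, reusable package
export meter
const meter = 1.0
end

module Project               # hypothetical project module
using ..Units                # with a real package this would be `using Units`
export meter                 # re-export a name that Project did not define itself
export beam_length
beam_length() = 3meter
end

using .Project
println(beam_length())
println(meter)               # available without the user ever loading Units directly
```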

My project has the peculiarity of having two types of users: those that will create new finite element types, and those that will use the project plus the new elements. To avoid cluttering the latter users’ namespace, I am considering segregating the functionality that supports element development into its own submodule (or simply not exporting it…).
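
Roughly what I have in mind, as a sketch with invented names (`LitheSketch`, `Dev`, `shape`, `solve`): the analyst gets only the exported API, while the element developer reaches the tools with a qualified name.

```julia
module LitheSketch           # hypothetical stand-in for the project module

module Dev                   # tools for element developers, not exported
shape(i) = i^2
end

export solve                 # the only name that `using LitheSketch` brings in
solve(n) = sum(Dev.shape(i) for i in 1:n)

end # module LitheSketch

using .LitheSketch
println(solve(3))                      # analyst-facing API
println(LitheSketch.Dev.shape(3))      # developer-facing, reached by qualified name
```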

Thank you Hendri for your kind help!

I would make the code that is independently usable into a package rather than a sub-module. It’s easier for the user and (I think) also easier for the code author.

Yep, I see that.
I might be making a mistake, but I am thinking of going through an intermediate step with a single package, getting that to work, and then splitting it into packages to maximize reusability.