How does the module system actually work?

Currently all of my files in my project look something like this:

module ConstraintSetMod

using ..DatasetMod

export ConstraintSet, ...

struct ConstraintSet
  ... fields ...
end

... functions ...

end

And I have a file called include.jl that looks like this:

# Include all files

include("types.jl")
include("dataset.jl")
include("constraint_set.jl")
include("cluster_set.jl")

carefully arranged in the right order such that each file depends only on files that come before it in this list.

It really seems like I’m doing something wrong here, but I have spent a lot of time looking for alternatives without finding anything that looked reasonable. How does the module system actually work?

Julia projects are typically just a single module - no reason to introduce/manage multiple namespaces if you don’t need to. You’d have a top-level file ConstraintSet.jl that looks something like this:

module ConstraintSet

using ExternalDependency

include("types.jl")
include("dataset.jl")
...

end

include is basically a copy-paste operation: it evaluates the contents of the file in the module where include is called, and it requires no assumptions about the module structure of whatever file it’s including.
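To make the copy-paste behaviour concrete, here is a minimal sketch. The module name Demo, the file contents, and the makepoint function are made up for illustration, and the files are written to a temporary directory only so that the snippet is self-contained:

```julia
# Hypothetical files, created on the fly so the sketch can run anywhere.
dir = mktempdir()
write(joinpath(dir, "types.jl"), "struct Point\n    x::Int\nend\n")
write(joinpath(dir, "dataset.jl"), "makepoint() = Point(0)\n")

# The top-level file wraps everything in ONE module. The included files
# contain no `module` statements of their own.
write(joinpath(dir, "Demo.jl"), """
module Demo
include("types.jl")     # paths are relative to the file calling include
include("dataset.jl")   # can use Point because both land in Demo
end
""")

include(joinpath(dir, "Demo.jl"))
println(Demo.makepoint())
```

Both files end up in the same namespace, so dataset.jl can use Point without any using or import statement.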

This strict dependency in declarations (that you see in C/C++) is not necessary in Julia. Having said that, I do find that simplifying your code to avoid circular dependencies is typically a good idea (whenever possible), as it makes your solution easier to read.

Going back to organizing your “module system”: In Julia, “software modules” are called “packages”, and I recently made a PR to help people get started:
→ FAQ: Creating your first package by ma-laforge · Pull Request #39186 · JuliaLang/julia · GitHub

But, to make it easier to read, here is a copy of it:


Creating your first Julia package

A quick way to start any new package is to create a skeleton with the Julia package manager:

julia> ]
pkg> generate path/to/my_package_repo/MyFirstPackage

The path/to/my_package_repo directory should now have the following contents:

my_package_repo
└── MyFirstPackage
    ├── Project.toml
    └── src
        └── MyFirstPackage.jl

Using this method, a UUID is automatically generated for the new package and written to Project.toml.

Testing/developing your new package

An easy way to make packages available to the active Julia environment is to add a package repository to the LOAD_PATH variable:

julia> push!(Base.LOAD_PATH, "/abs/path/to/my_package_repo")

To automate this process and make the repository available for every new julia session, add the above statement to ~/.julia/config/startup.jl.

Hello World!

Since pkg> generate automatically creates a greet() function, you can call it after loading MyFirstPackage with either import or using:

julia> import MyFirstPackage
julia> MyFirstPackage.greet()
Hello World!
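For anyone who wants to try the LOAD_PATH lookup without touching their own directories, the following is a rough sketch. It builds the src layout by hand in a temporary directory (standing in for /abs/path/to/my_package_repo) instead of calling pkg> generate, so there is no Project.toml and no UUID; the greet() body is written out by hand to match what the generated skeleton provides:

```julia
# Temporary stand-in for /abs/path/to/my_package_repo (assumption for this sketch).
repo = mktempdir()
src = joinpath(repo, "MyFirstPackage", "src")
mkpath(src)

# Hand-written equivalent of the generated skeleton file.
write(joinpath(src, "MyFirstPackage.jl"), """
module MyFirstPackage
greet() = print("Hello World!")
end
""")

# Julia looks for MyFirstPackage/src/MyFirstPackage.jl inside each LOAD_PATH entry.
push!(LOAD_PATH, repo)

import MyFirstPackage
MyFirstPackage.greet()   # prints "Hello World!"
```

Note that the entry pushed onto LOAD_PATH is the repository directory that contains MyFirstPackage, not the package directory itself.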

Alternative: Testing/developing your new package

It is also possible to add new packages-under-development to a given project/environment using pkg> dev /abs/path/to/my_package_repo/MyFirstPackage. However, beginners might want to keep to the LOAD_PATH solution until Julia projects & environments are well understood.

Organizing package files

There are no strict rules on how to organize package source files, but the following is a good starting point:

my_package_repo
└── MyFirstPackage
    ├── Project.toml
    └── src
        ├── MyFirstPackage.jl
        ├── component1.jl
        ├── component2.jl
        ├── component3
        │   ├── subcomponent1.jl
        │   └── subcomponent2.jl
        ├── component4.jl
        ...

In this example, a “component” could be anything that warrants being in a separate file. For example:

  • A set of functions to operate on a given type (like an “object” definition).
  • Code used to display objects of multiple types (ex: collection of show methods).
  • A collection of type definitions.
  • A given software layer (ex: the external interface intended for users of the package).
  • …

Here is a slightly more concrete example:

my_package_repo
└── MyFirstPackage
    ├── Project.toml
    └── src
        ├── MyFirstPackage.jl
        ├── types.jl
        ├── mainalgorithm.jl
        ├── FileFormatA
        │   ├── FileFormatA.jl
        │   ├── reader.jl
        │   └── writer.jl
        ├── FileFormatB
        │   ├── FileFormatB.jl
        │   ├── reader.jl
        │   └── writer.jl
        └── display.jl

With such a file structure, MyFirstPackage is assembled by loading code from individual files (using include() statements). The following illustrates how this can be done:

src/MyFirstPackage.jl

module MyFirstPackage

#Import EXTERNAL packages required by solution:
using FFTW #Does frequency analysis
import Gtk #Needs a GUI (import avoids namespace pollution/collisions)
...

#Include INTERNAL project files themselves:
include("types.jl")
include("mainalgorithm.jl")
include("FileFormatA/FileFormatA.jl")
include("FileFormatB/FileFormatB.jl")
include("display.jl")

#Convenience functions:
readAdata(filename::String) = FileFormatA.readdata(filename)
readBdata(filename::String) = FileFormatB.readdata(filename)

...

end

src/FileFormatA/FileFormatA.jl

#Outside: Namespace is still "MyFirstPackage"

module FileFormatA #Preference: solution is cleaner with separate namespace
#Inside: Namespace is "MyFirstPackage.FileFormatA"

#Import EXTERNAL packages required by this "sub-module":
using DataFrames #Build on readily available reader/writer
...

#[Probably add type definitions here]

#Include INTERNAL project files themselves (file-relative):
include("reader.jl")
include("writer.jl")

...

end

On a conceptual basis, src/FileFormatB/FileFormatB.jl would be similar to src/FileFormatA/FileFormatA.jl.
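One thing the layout above does not show is how code inside a sub-module reaches names defined in the enclosing module. Here is a minimal single-file sketch of that, assuming a made-up package name PackageSketch and a made-up Record type that would normally live in types.jl; the readdata body is a placeholder, not a real reader:

```julia
module PackageSketch

# Would normally come from include("types.jl").
struct Record
    value::Int
end

# Would normally come from include("FileFormatA/FileFormatA.jl").
module FileFormatA
# ".." is the enclosing module, so this pulls Record in from PackageSketch.
using ..PackageSketch: Record

# Placeholder "reader": just stores the length of the file name.
readdata(filename::String) = Record(length(filename))
end

# Convenience function in the outer namespace, as in the example above.
readAdata(filename::String) = FileFormatA.readdata(filename)

end

println(PackageSketch.readAdata("abc").value)   # 3
```

The relative using only works because Record is defined before the sub-module is evaluated, which is why types.jl is included first in the layout above.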

I somehow gained the impression that messing with LOAD_PATH was at this point discouraged?

Hm, interesting. This structure actually seems very similar to Rust’s module system, except that include does not implicitly put a module around the contents of the file. In Rust there are separate mod and use statements, and it seems like include corresponds to mod, and using/import corresponds to use.

I’ll have to play around with it a bit.

As for LOAD_PATH, I also got that impression. I did see it when looking around for solutions, but always accompanied with warnings about it being a bad idea.

Though it was missing a bit of explanation, @stillyslalom is correct in stating that Julia [packages] are typically just a single “module” (even if the same might apply to “projects” that are not simultaneously “packages”). I say this because the terms you are used to might not have the same meaning in Julia.

The main reason why this is a correct statement is that a Julia “module” is actually what you would call a “namespace”. It does not represent a “software module”.

I have made a few additional posts that might help you:

(Though you might not want to read about “Multi-package repositories” at this time)

As a suggestion for your documentation PR, I think it would be nice to include an example of how to import things from other files of the same project (and not just in the REPL).

I assume that with “namespace”, you are referring to C++ terminology here? I am not really familiar with C++.

I do not know if that is C++ terminology. But here they are only saying that modules allow you to define, literally, different name spaces, meaning that things can have the same name in different modules:

julia> module A
         x = 1
       end
Main.A

julia> module B
         x = 2
       end
Main.B

julia> A.x
1

julia> B.x
2

Thus, strictly speaking, modules in Julia are just that: namespaces. Generally a single namespace is all a package needs (you won’t have different variables and functions with the same name spread across the same package, because multiple dispatch covers the need for multiple methods with the same name acting on different types of arguments). But, if you really need that, you can have more than one namespace and, thus, more than one module in a package.
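As a small illustration of the multiple dispatch point (the Circle, Square and area names are invented for this sketch), two methods can share one function name inside a single namespace:

```julia
struct Circle
    r::Float64
end

struct Square
    side::Float64
end

# One function name, two methods; Julia picks the method from the argument type.
area(c::Circle) = pi * c.r^2
area(s::Square) = s.side^2

println(area(Square(2.0)))     # 4.0
println(length(methods(area))) # 2
```

In a language without dispatch on argument types, these might have needed separate namespaces or differently named functions.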

Check out the 1.6/master docs on modules, they have been reorganized:

https://docs.julialang.org/en/v1.7-dev/manual/modules/

The include system isn’t very user-friendly, it’s true.

You might like FromFile, which looks to resolve these issues, by making it so that you never have to write another include again.

(Note that technically this – and your post – are about how to organise the dependency structure between files, which are not the same thing as modules.)

You may also be interested on the section “Developing Julia Packages” in my Julia Concise Tutorial, that has several examples on modules and packages…

Well, I myself don’t particularly like the idea of using global variables like these to specify a library, but I don’t actually have a good reason for this (other than an unease of LOAD_PATH possibly changing over time).

That said, I don’t think it is that bad of a methodology. You do need a mechanism to tell Julia where to get its libraries. It might have been nicer to do this through an API/function call (maybe a config “file”/structure, like how registries are added?), but it still seems like a reasonable solution for now.

Cannot find warning

I am curious where you saw this warning. I myself vaguely remember reading this, but I can’t find it anymore (did I imagine this?).

Could you please post a more specific reference? Thanks.

Overall

Overall, I think of LOAD_PATH as an alternative to building your own custom package registry. Which is good, because I still haven’t figured out how to build my own package registries.

I think it is relatively safe, because you could customize your Julia launch script to bundle in whatever package libraries you wish to use for each project you develop:

bash $ julia_satellite_imaging
[adds custom image processing libraries to LOAD_PATH]
[...and loads julia, of course:]
julia> #Ready for processing satellite images

So you don’t necessarily have to add your custom packages to ALL executions of julia (ex: if you added it to your ~/.julia/config/startup.jl).

And once you are satisfied with your set of packages (you might have added/removed/merged/renamed them along the way), you can “publish” them to the General registry.

Alternatively, you can “publish” them to your own internal registry (a custom “library” of sorts) - especially if you are developing an internal product for your company.

That is correct. Sorry. I just read that Rust implicitly creates “namespaces” for each file - and that the term is not really used by Rust developers.

But the fact that you are familiar with Rust does explain why you were worried about ordering declarations like this (I would guess that aspect of C++ got carried over to Rust).

Excellent! I was not aware of this update. I haven’t finished reading, but it seems like a definite improvement to understanding modules.

Actually, the ordering is not important in Rust. If the ordering is not necessary, then I just made a mistaken assumption about Julia’s include. Probably mostly because it takes the file as a string, so it looks a lot like an eval sort of thing, which always makes me very careful.

To be specific, my thought process was:

  1. Include is not a language construct, but like an actual function that has to return, and therefore it is not able to see code that will be included in the future.
  2. If the compilation of the included file tries to use a function that is defined in a later include, it will just appear like a call of a function that doesn’t exist, which should of course return an error.

What I got wrong is that the compiler doesn’t actually report a missing function until the code runs, so the above is not a problem. (Though this makes me sad, since I have to spend a lot of time waiting for the running code to reach the offending part to figure out where I have misspelled a function or messed up the argument order, and then when I fix one issue, I have to wait again.)

Never mind, I guess I was right about the order thing. I just tried to mix up the order, and I got crazy errors where a type defined before its super-type did not actually become a sub-type of that abstract type.

I’m guessing this happened because it actually became a sub-type of the old version of the abstract type, left over from an earlier call to include("include.jl") in the REPL.

I think that yes, the order of the include("something.jl") calls may be relevant. But often it is not, because if you are only including functions (and not executing any function calls outside of them), then all functions will be included before anything breaks. However, if you are toying with global variables, types, and general code executing in global scope, then yes, the order would matter.
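A quick way to see the difference, as a sketch with invented names (caller, helper, Animal, Dog), run at top level in a fresh session:

```julia
# A function body may mention a function that does not exist yet;
# the name is only looked up when the call actually runs.
caller() = helper() + 1

early = try
    caller()             # helper is still undefined at this point
catch err
    err
end
println(typeof(early))   # UndefVarError

helper() = 41            # defined "later", as if by a later include
println(caller())        # 42

# Types are different: the supertype must exist when the struct is evaluated.
late = try
    @eval struct Dog <: Animal end   # Animal is not defined yet
catch err
    err
end
println(typeof(late))    # UndefVarError

abstract type Animal end
struct Dog <: Animal end
println(Dog <: Animal)   # true
```

So files that only define functions can usually be included in any order, while a file that declares a subtype has to come after the file that declares its supertype.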

I have no global variables, and all of the types and functions are wrapped inside a module. The include call doesn’t run any code besides defining types and functions. It matters because the modules that it defined were in the global scope, and a using call in one file caught a module from a previous invocation of include.
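The stale-module effect can be reproduced in a few lines. This is only a sketch with a made-up Shapes module, evaluated twice in the same session the way a repeated include("include.jl") would do it:

```julia
module Shapes
abstract type Shape end
end

OldShape = Shapes.Shape   # keep a handle on the first definition

# Evaluating the same definition again replaces the module
# (Julia prints a "replacing module" warning).
module Shapes
abstract type Shape end
end

# Same name, but two distinct types: anything that captured the old module
# still refers to the old Shape.
println(OldShape === Shapes.Shape)   # false
```

Starting a fresh session (or putting everything in one top-level module, as suggested earlier in the thread) avoids mixing definitions from two generations of the same module.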