Julia Modules

The current situation with Julia modules is a nightmare: there are various keywords (include, using, import) and no simple, standard system that lets the compiler find modules without playing around with source files. The same goes for exports. Rust has a nice and clean system for addressing these issues.

Are there any plans for version 1.0 or beyond to provide a solution for this problem?

I think you’re more likely to get a good answer if you state the problems you perceive explicitly, and also describe specifically how you find Rust’s system superior.

All right then. Here it goes:

  1. I don’t like the include statement because it makes me need to know the directory and file structure of some code I want to use. I am talking about code that has not been added to Julia as a package. This is like C/C++ header files, only in Julia it is the complete code and not just declarations. In essence, Julia requires you to play with the include statements so that you end up with a single file that contains the whole code you are developing, in the proper order. This is a lot of work.

  2. The order one includes files is significant because one file may define stuff another one uses.

  3. An include file can be anything from a single function to a module to a whole lot of other include files and so it is easy for someone to lose control.

  4. I don’t like the idea that if one wants to create other versions of a function he needs to import the original definition of this function. How am I supposed to know where the original definition is and why should I care?

  5. The using statement essentially exposes the exported contents of a module to the code that follows. If this module is installed in Julia as a package, things are simple enough. If not, one must somehow access the module, with an include statement perhaps, and then call using.

  6. On top of that, using accepts one dot or two dots in front of the module name, depending on whether the module is at the same level of code or outside on another level, such as in another module. Total confusion.
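
As a minimal sketch of the dot syntax described in item 6 (the module names Outer, A and B are made up for illustration): a single dot refers to a module defined inside the current module, while two dots step up one level first.

```julia
module Outer

module A
export greet
greet() = "hello from A"
end

module B
using ..A            # two dots: step up to Outer, then pick its submodule A
shout() = uppercase(greet())
end

using .A             # one dot: A is defined directly inside the current module
hello() = greet()

end # module Outer

println(Outer.hello())
println(Outer.B.shout())
```

Both calls go through the same greet function; only the path used to reach module A differs.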

I think that the developer should not be bothered with including files, code levels and imports. In Rust there is a specific directory structure and naming convention for modules, so the developer only needs to name the module to access it, with a clear and simple statement. It is then up to the compiler to find where the module is and how to use it in order to compile the code that calls it.

> I don’t like the include statement because it makes me need to know the directory and file structure of some code I want to use.

It sounds like you’re using include() to load the modules that you’re trying to use. I wouldn’t recommend doing that, for exactly the reason you specified. Julia has a LOAD_PATH variable you can set so that import and using can load modules which are not installed as packages. By using import you avoid ever having to know anything about the code’s directory structure. See https://docs.julialang.org/en/stable/manual/workflow-tips/#A-basic-editor/REPL-workflow-1 and JuliaLang/julia pull request #24223, “Document that the REPL workflow assumes your module is in LOAD_PATH”, by rdeits.

> The order one includes files is significant because one file may define stuff another one uses.

Also a great reason to use import when loading others’ code, rather than include()

> An include file can be anything from a single function to a module to a whole lot of other include files and so it is easy for someone to lose control.

Again, just import or using the module so that you will always get a Module and nothing else

> I don’t like the idea that if one wants to create other versions of a function he needs to import the original definition of this function.

If two packages define plot() and you want to extend one with a new method, how can the language possibly know which one you mean unless you specify it?
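
To make the ambiguity concrete, here is a small sketch with two hypothetical modules, PlotsA and PlotsB, that each define their own plot(). Qualifying the name is what tells Julia which function receives the new method. The Mesh type is likewise invented for the example.

```julia
module PlotsA
plot(x) = "A draws $x"
end

module PlotsB
plot(x) = "B draws $x"
end

struct Mesh end   # hypothetical user-defined type

# Add a method to PlotsA's function only; PlotsB.plot is a separate function.
PlotsA.plot(m::Mesh) = "A draws a mesh"

println(PlotsA.plot(Mesh()))
println(PlotsB.plot(1))

# At the REPL, `@which PlotsA.plot(Mesh())` reports the method that would be
# called and the location where it was defined.
```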

> How am I supposed to know where the original definition is

Check out @which

> and why should I care?

I guess I can’t really answer that.

> The using statement essentially exposes the exported contents of a module to the code that follows. If this module is installed in Julia as a package things are simple enough. If not one must somehow access the module, an include statement perhaps, and then call using.

As above, you shouldn’t need to use include() for this. Instead: push!(LOAD_PATH, "path/to/module"); using Module.
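
A self-contained way to try this out, sketched under the assumption that a throwaway directory stands in for "path/to/module" and that HelloMod is a made-up module name:

```julia
# Create a temporary directory holding a single module file, HelloMod.jl.
dir = mktempdir()

source = """
module HelloMod
export hello
hello() = "hello from a module found via LOAD_PATH"
end
"""

open(joinpath(dir, "HelloMod.jl"), "w") do io
    write(io, source)
end

# Tell Julia to search that directory as well, then load the module by name.
push!(LOAD_PATH, dir)
using HelloMod

println(hello())
```

The calling code never mentions the file name or its location beyond the single LOAD_PATH entry, which is the point being made above.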

In my current convention I use include for all the files that form a particular module which do not contain other modules but only functions and types. I place these files in a specific directory with the name of the module and I put the main module file outside this directory:

src/MeshGeneration/
  mesh1.jl
  mesh2.jl
  mesh3.jl

src/MeshGeneration.jl 
with contents:

module MeshGeneration

# Includes
include("MeshGeneration/mesh1.jl")
include("MeshGeneration/mesh2.jl")
include("MeshGeneration/mesh3.jl")

# Exports
export fun1, fun2, fun3...

end

src/MasterCode.jl
with contents:

module MasterCode

include("MeshGeneration.jl")
include(... other modules...)

end

src/main.jl

include("MasterCode.jl")
using MasterCode.MeshGeneration

I’m specifically recommending that you not do this. Instead, if your LOAD_PATH is set appropriately, then you can do:

import MeshGeneration
import OtherModule1
import OtherModule2

(or replace import with using as you see fit). This avoids the need to include() other modules, and ensures that you get exactly one module per import rather than whatever include() happens to provide.

At some point I used the load path option, but I had problems running the code from within an IDE. I was getting errors saying that Julia was unable to find the modules and asking me to add them using Pkg.add(). I also had the same problem when trying to use the static Julia compiler to create an executable, since I do not care about “compilation during execution” and Jupyter notebooks.

Where is one supposed to put these statements? I had them in the main executable file. What if I have modules which need to import other local modules to work and I am building a library?

> At some point I used the load path option but I had problems running the code from within an IDE. I was getting errors that Julia was unable to find the modules asking me to add them using Pkg.add().

Most likely you just misconfigured LOAD_PATH or the JULIA_PKGDIR environment variable. Without specific symptoms it’s impossible to figure out exactly what was wrong.

> I also had the same problem when trying to use the static julia compiler to create an executable since I do not care about compilation during execution and jupyter notebooks.

Julia now precompiles all modules by default. Normally you:

  1. Decompose your code into logical modules. Something you do in any language anyways.
  2. Create a package for each module and put it in your JULIA_PKGDIR (~/.julia/v0.6 by default). You can publish your package to GitHub or keep it private.
  3. Create some Main package, put using A; using B statements in it to load your modules, and implement the application-specific logic there.
  4. If you want to make a script from it, you can make something as simple as:
#!/bin/bash
julia -e 'using Main; Main.launch()'
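
The entry point in step 4 could look roughly like this. Because Main is already the name of Julia's built-in top-level module, the sketch uses a hypothetical package name, MyApp. The launch() function and its return value are also assumptions for illustration, not part of any existing package.

```julia
module MyApp

# In a real application, `using A; using B` for the logical modules
# described in steps 1 to 3 would go here.

function launch(args = ARGS)
    println("MyApp started with ", length(args), " argument(s)")
    return 0   # conventional "success" status
end

end # module MyApp

MyApp.launch(String[])
```

With such a package on the load path, the shell script would call julia -e 'using MyApp; MyApp.launch()' instead.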

A note on include(). Sometimes you want many functions to be logically in the same unit, e.g. module, class, etc., but don’t want to have files with 1000+ lines of code. The question is: how do you put code physically into separate files, but logically to a single unit?

Different languages solve this differently. In Python, for example, you usually make multiple submodules and then simply reexport the functions you need. In C++ you can define the same namespace in several files. Scala guys use traits and the so-called cake pattern to create a single class that “extends” its components.

Julia provides a very simple alternative: it lets you directly include files that you want to be parts of a single module. For example, in Espresso.jl I have > 4k lines of code split into 33 files, but a single module exporting ~90 functions. Of course, if you want more logical modules, you can create more packages or a single package with several modules, each module consisting of a single file or including multiple others - whatever structure you believe is appropriate for your project. But don’t mix up include and using/import - the first is for physical layout, the second is for logical units.
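
A minimal sketch of that physical/logical split, written so it can be run as one script: it first writes three small files to a temporary directory and then loads them. The module name Geometry, the file names and the functions are all invented for the example.

```julia
dir = mktempdir()

types_src = "struct Point\n    x::Float64\n    y::Float64\nend\n"
ops_src   = "norm2(p::Point) = p.x^2 + p.y^2\n"

# The main file defines the single logical module and pulls in the pieces.
main_src = """
module Geometry
export Point, norm2
include("types.jl")   # must come first: ops.jl refers to Point
include("ops.jl")
end
"""

sources = Dict("types.jl" => types_src, "ops.jl" => ops_src, "Geometry.jl" => main_src)
for (name, src) in sources
    open(joinpath(dir, name), "w") do io
        write(io, src)
    end
end

# include() is used once here only because the demo files live in a temp dir.
include(joinpath(dir, "Geometry.jl"))

p = Geometry.Point(3.0, 4.0)
println(Geometry.norm2(p))
```

The relative paths passed to include() inside Geometry.jl are resolved against the directory of Geometry.jl itself, so the module's files can be moved around together.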

I wrote a package for developing packages in much the same manner you want.

Maybe check out Julz?

It autoloads directories and can work directly from the command line

(you probably have to check out the master branch, though)

This looks very interesting and I will certainly try using it. Thank you.

I don’t have a strong opinion on whether the module system of Julia needs revision, but I do strongly believe that the relevant section of the documentation needs to be rewritten. It starts out with some sketchy examples followed by a very deep dive into arcane technical matters pertaining to precompilation. There should be a much clearer explanation of how the paths work, packages versus modules, typical use cases, etc. I would volunteer to write it myself if only I understood it better.

My attempt at a gentler introduction is here. Corrections and improvements are always welcomed… it will need updating for v0.7/1.0 fairly soon though.

Hi,

I don’t mean to tell anyone how their modules, etc. should work, but reading diagonally I see some stuff about paths above, so I thought I would interject with our approach for MATLAB. I’m a math prof.

I have found that installing custom packages (e.g. chebfun) in some global path creates significant obstacles to collaboration. This is because each programmer often has a slightly different version of chebfun in their path, so what works for one person doesn’t work for anybody else. After a while, each person, with some sort of zoo of custom packages installed in some path, is programming in a custom language that exists only for that one person on the face of the Earth. You also see this with people who have extensive custom LaTeX packages they’ve developed over decades and placed in some standard path they have.

Instead, for each new project, we put all the external dependencies inside that project’s directory. We don’t modify paths in a permanent, invisible way. For each package that has the unfortunate need to be on the path, we call addpath() at the beginning of our scripts, but there is no savepath(). This is at odds with the standard installation instructions for chebfun, but it works for us.

In brief, we don’t like to put nonstandard packages in an out-of-sight search path. For each new project, all the external dependencies are actually in that project’s directory and its subdirectories. Each new project starts with nothing but stock MATLAB stuff.

Well this is a deficiency of MATLAB not having a package manager. In any other language, the package manager keeps everyone up to date so this shouldn’t be an issue.

And this is due to MATLAB’s lack of proper namespacing. The issue with MATLAB is that it’s missing every conceivable feature related to package development, so it encourages bad practices to get around the lack of design, but I’m not sure how much these apply to other languages.

> In any other language, the package manager keeps everyone up to date

Yeah about that. I gave up on maintaining numeric.js because updating my toolchain with all the package managers requires one week of work every time I want to roll out an update.

In addition to what @ChrisRackauckas said (solving this with a package manager), this can also be done simply with git repos. When I collaborate with people on software, we put the code in a repo on GitHub, subscribe to notifications about updates, and update regularly, e.g. every morning. This can be done with Pkg.update() (after the package was cloned) or directly with git.

This is, of course, an option, but the price you pay for this (as opposed to having those packages separate and track upstream) is that updating them becomes a major undertaking with a large fixed cost, and consequently you miss bug fixes etc. The Julia package ecosystem is moving pretty fast, so if you just don’t update for a few weeks you miss out on a lot.

Looking at the comments of @reits on the LOAD_PATH and the other points made above.
I work in HPC, and the ‘traditional’ method on HPC systems is to use the Modules environment to set up your environment.
(Not to be confused with Julia modules).
https://www.tacc.utexas.edu/research-development/tacc-projects/lmod
I would like to see some good practice in writing HPC Modules which set up Julia versions and Julia packages properly.
If anyone wants to collaborate on that send me a message.

Also, while I am on this topic, has anyone worked on an EasyBuild toolchain to install optimised Julia versions?
https://github.com/easybuilders/ easybuild

(apologies for the space - the full URL mucked up my Reply button somehow)

If the modules are just for your own personal use, you can place the call to push!(LOAD_PATH, path) in the top-level file, as you suggested. I found it more convenient to place it in the initialization file, juliarc.jl. However, if you are building a package, you would have to follow the advice given by @dfdx.
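
For example, the initialization file could contain something like the following. The directory is a placeholder, and the constant name is invented. In Julia 0.6 the file is ~/.juliarc.jl, and from 0.7 on the same code goes in ~/.julia/config/startup.jl.

```julia
# Hypothetical location of personal, non-package modules.
const LOCAL_MODULES = joinpath(homedir(), "code", "julia-modules")

# Add it once per session so `using SomeLocalModule` can find files there.
if !(LOCAL_MODULES in LOAD_PATH)
    push!(LOAD_PATH, LOCAL_MODULES)
end
```

The guard only avoids duplicate entries if the file happens to be evaluated twice in the same session.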

Interesting. Just out of curiosity, could you please name the package manager for Fortran77?
Or for C, or C++?

Okay fine, but there’s a reason why these languages are considered hard to use. And this does encourage bad practices, with the infamous “every academic Fortran77 package is a single function” setup being true more often than not…
