How can I structure a project in multiple modules and have code run in them?

bertulli · June 13, 2023, 3:16pm

Hi all! Sorry this is probably noobish but I couldn’t figure it out easily from the docs/Google.
I’m developing a small project for my thesis. I generated a project structure (src/, Project.toml and Manifest.toml), and have some code inside src/. Until now, I manually loaded and executed code when starting my work session, but now I decided to better organize my code. I was advised to use modules, so I activated my project and added the dependencies, but now I couldn’t understand how to achieve what I need.

For context, the project is called ARMCortexM4Model, but the file ARMCortexM4Model.jl contains no real code, all the work is done in other files. For instance, I have created a module named Utils in src/Utils.jl containing some helper functions I need in other scripts.

Right now, to use it I need first to

include("Utils.jl")
using .Utils

IIUC I should do something like ] dev Utils to make Julia aware of the submodule, but I can’t:

(@v1.9) pkg> activate .
  Activating project at `~/onedrive/tesi/Julia_package/ARMCortexM4Model`

(ARMCortexM4Model) pkg> dev Utils
    Updating registry at `~/.julia/registries/General.toml`
ERROR: The following package names could not be resolved:
 * Utils (not found in project, manifest or registry)
   Suggestions: IOUtils GRUtils MLUtils PkgUtils EHTUtils DCAUtils pciutils_jll

(ARMCortexM4Model) pkg> dev .Utils
ERROR: Unable to parse `.Utils` as a package.

My guess is that Julia doesn’t know there are submodules, nor where they are, since they do not appear in the Project.toml. Do I have to edit that? How can I split code in different files otherwise? (and btw, is this even the idiomatic way to proceed?)

I also want to use modules to run some initialization code at startup, but just once (à la #pragma once in C/C++). For example, I need to read a large CSV file and store it in a DataFrame, so I put the loading statement in another module, Constants.jl. The idea is to start each “working” script with using Constants and have the DataFrame loaded the first time I execute it, but not subsequent times. Can this be accomplished with modules, maybe, IIUC, using the __init__ function?

Thanks!

fatteneder · June 13, 2023, 3:21pm

Create a file called src/ARMCortexM4Model.jl and fill it with

module ARMCortexM4Model

include("Utils.jl")
export Utils

end

Assuming your src/Utils.jl looks like

module Utils

greet() = println("Hello world!")

end

You can then fire up a REPL and write

julia> using ARMCortexM4Model

julia> Utils.greet()

fatteneder · June 13, 2023, 3:28pm

I prefer a simple caching mechanism a la

using DataFrames

const df = DataFrame()

function loaddata(reload=false)
   if reload || isempty(df)
      empty!(df)
      # (re-)load your data into df
   end
   return df
end

And then just use the function whenever you need to grab the data.
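To make the caching pattern above concrete, here is a minimal, dependency-free sketch. It swaps the DataFrame for a plain Vector of named tuples so that it runs without installing anything, and it uses a keyword argument for reload instead of the positional one above. The file contents, the column names and the names ROWS and LOAD_COUNT are all made up for illustration.

```julia
# Cache lives in a module-level constant; the binding is constant, the contents are not.
const ROWS = Vector{NamedTuple{(:name, :cycles), Tuple{String, Int}}}()
const LOAD_COUNT = Ref(0)   # only here to show how often the file is really read

function loaddata(path::AbstractString; reload::Bool=false)
    if reload || isempty(ROWS)
        empty!(ROWS)
        # Skip the header line, then parse "name,cycles" records.
        for line in Iterators.drop(eachline(path), 1)
            name, cycles = split(line, ',')
            push!(ROWS, (name = String(name), cycles = parse(Int, cycles)))
        end
        LOAD_COUNT[] += 1
    end
    return ROWS
end

# Hypothetical data file, written to a temporary location for the demo.
path = tempname()
write(path, "name,cycles\nADD,1\nMUL,1\nSDIV,12\n")

loaddata(path)                # reads the file
loaddata(path)                # served from the cache
loaddata(path; reload=true)   # forces a second read
```

With DataFrames and CSV.jl installed, the loop would presumably be replaced by a single read into the const DataFrame; the caching logic stays the same.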

bertulli · June 14, 2023, 9:19am

Thanks, this partially answers my question. But I wonder:

This way I need to load all the submodules at once inside ARMCortexM4Model. Can I selectively load just some of them, without modifying ARMCortexM4Model.jl every time? In other words, can I just say

using Utils

in a script, and

using Constants

in another one (so to decide every time which subset of helper functions get loaded)?

Is there a way to declare more than one module in a single project?
I read that ] dev is used to develop a package. But now I don’t understand how: if I am bound to develop a single package at a time, there’s actually no need to dev it in the first place. I thought I could dev all the modules I was working on, so that I could independently include them in my working scripts, and use Revise if I needed to modify the modules’ code. What did I misunderstand?

fatteneder:

I prefer a simple caching mechanism a la

using DataFrames

const df = DataFrame()

function loaddata(reload=false)
   if reload || isempty(df)
      empty!(df)
      # (re-)load your data into df
   end
   return df
end

Thanks, IIUC this also has the advantage of allowing the data to be re-loaded at your discretion. But just to better understand how modules work, can I do the same using “just” them?

Thanks!

fatteneder · June 14, 2023, 10:01am

Yes. The export Utils inside ARMCortexM4Model.jl only exports the name Utils, but does not export all the names Utils itself would export.
Hence, when you write using ARMCortexM4Model you can then use Utils.greet() instead of ARMCortexM4Model.Utils.greet(). If you also add export greet inside Utils.jl then you could write

julia> using ARMCortexM4Model

julia> using ARMCortexM4Model.Utils

julia> greet()
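As a single-file sketch of the export behaviour described above: the submodule is written inline here instead of being pulled in with include, and greet returns a string instead of printing so the result is easy to check. Because the modules are defined in the script itself rather than installed as a package, they are loaded with a leading dot. Note that the submodule is reached through its parent's path; a bare using Utils would make Julia look for a separate package called Utils.

```julia
module ARMCortexM4Model

module Utils
export greet                 # makes `greet` visible after `using ...Utils`
greet() = "Hello world!"
end # module Utils

export Utils                 # exports only the name `Utils`, not `greet`

end # module ARMCortexM4Model

using .ARMCortexM4Model      # brings the exported name `Utils` into scope
msg1 = Utils.greet()         # qualified call works

using .ARMCortexM4Model.Utils  # now the names exported by Utils are in scope too
msg2 = greet()               # unqualified call works
```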

Yes and no: a package (as generated by running Pkg.generate("MyProject")) is defined as a folder named MyProject containing a Project.toml file that refers to MyProject (and optionally a Manifest.toml), plus a src/MyProject.jl file. Let’s call this a toplevel package, although I think it really is an environment (or project), because it tracks all the dependencies used in it, and it so happens that the module it provides is also called MyProject. Think of a project like a virtualenv in Python, if you happen to know that …
There are then two ways to add more modules:

1. Recommended: Add submodules to MyProject as I did above, which you can optionally reexport. See also the @reexport macro from Reexport.jl.
2. Add another package with Pkg.generate("MyProject/Utils"). You must then tell your toplevel package to develop this package, e.g. run Pkg.develop(path="./Utils") from inside the MyProject folder.

The dev is needed so that Revise.jl can track its code. The alternative is to add the package, but for that to work the package needs to be a git repository. Adding another module through dev is really only meant for local development of packages and their dependencies and/or monorepos like Makie, where multiple packages are developed inside one toplevel package.
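For reference, the second option (a separate nested package that the toplevel project develops) could be scripted roughly as follows. This is only a sketch: it works in a temporary directory, the names MyProject and Utils are the placeholders from above, and absolute paths are used so that the result does not depend on the current working directory.

```julia
using Pkg

root = mktempdir()
top  = joinpath(root, "MyProject")
sub  = joinpath(top, "Utils")

Pkg.generate(top)       # creates MyProject/Project.toml and MyProject/src/MyProject.jl
Pkg.generate(sub)       # creates a second, independent package in MyProject/Utils

Pkg.activate(top)       # make MyProject the active environment
Pkg.develop(path=sub)   # record Utils as a path-tracked dependency of MyProject
```

After this, code run in the MyProject environment can say using Utils, because Utils is now a real package listed in MyProject's Project.toml rather than a submodule.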

In case you haven’t discovered it yet, here is a link to the Pkg docs, which talk about packages and environments a bit more: 1. Introduction · Pkg.jl

I also must admit I am not a big fan of this kind of project organization and I feel like I am seeing this question being asked way too often. Perhaps this is an indication that things could be simplified or made more user friendly, but then I did not think how …

Sorry, didn’t understand the question.

bertulli · June 14, 2023, 2:28pm

fatteneder:

Hence, when you write using ARMCortexM4Model you can then use Utils.greet() instead of ARMCortexM4Model.Utils.greet(). If you also add export greet inside Utils.jl then you could write
julia> using ARMCortexM4Model

julia> using ARMCortexM4Model.Utils

julia> greet()

Thanks, this gets closer to what I want, which by the way is what you explained later.

I read some parts of Pkg’s documentation, but if you are referring specifically to the introduction no, I haven’t, funnily enough.

I mean, can modules also be used to initialize some variables/constants one-time-only? Sticking to my case, I need to export a DataFrame variable (easily doable in a module), but it needs to be constructed at runtime, it’s not embedded in the code. If I put this code in a module, and then load it, I should obtain what I want. But I also read about the __init__ function. Don’t they do the same thing (initializing things)? If not, what is the difference? And if yes, why does the __init__ function exist?

Thanks!

fatteneder · June 14, 2023, 2:46pm

Yes, you could write something like

module Utils

using DataFrames

const df = DataFrame()

function __init__()
   # load something into df
end

end

Whenever you then run using Utils in a fresh REPL it should load the thing.
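On the difference between plain top-level code and __init__: in a precompiled package, top-level statements in the module body run when the package is precompiled and their results are cached, whereas __init__ runs each time the module is loaded into a session. That makes __init__ the natural place for state that must be built at runtime. The sketch below is dependency-free (a Dict stands in for the DataFrame) and the names Constants, TABLE and INIT_COUNT, as well as the values stored, are invented for illustration.

```julia
module Constants

const TABLE = Dict{String,Int}()   # empty container created when the module is defined
const INIT_COUNT = Ref(0)          # counts how often __init__ has run

function __init__()
    # Runs once, right after the module has been loaded.
    INIT_COUNT[] += 1
    TABLE["ADD"]  = 1              # hypothetical values; a real module would
    TABLE["SDIV"] = 12             # read its CSV file here instead
end

end # module Constants

using .Constants    # module is already loaded; __init__ has run exactly once
using .Constants    # a repeated `using` does not run __init__ again

n_init = Constants.INIT_COUNT[]
```

So a second using in the same session is already a no-op, which gives the "load once" behaviour; what __init__ does not give you is a way to reload on demand, which is where the loaddata approach is more flexible.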

nsajko · June 14, 2023, 3:56pm

I’m not sure whether there are any unanswered questions left, however, I suggest using the PkgTemplates package for generating package templates. It’s used like so, followed by an interactive process:

using PkgTemplates
generate()

Furthermore, if you want an example of a package whose code is tidily structured into modules, you can take a look at my package here: Neven Sajko / FindMinimaxPolynomial.jl · GitLab

Also, for your future questions on Discourse, I suggest explaining your problem in more detail, it will allow people to help you better. And it would probably be good to open a separate thread for each separate question.

bertulli · June 14, 2023, 6:50pm

Thanks, I see it more or less works as @fatteneder explained. I see that you have at the top of src/Minimax.jl:

# Copyright © 2023 Neven Sajko

module Minimax

import
  MathOptInterface,
  Polynomials,
  ..NumericalErrorTypes,
  ..NumericalErrors,
  ..PolynomialPassingThroughIntervals,
  ..LinearInterpolation,
  ..ApproximateInfinityNorm,
  ..IntervalHelpers,
  ..CompressedPolynomials

How can you import modules defined in other files, without having included them beforehand?

Thanks, I’ll try. I put similar questions in a single thread because I thought they were sufficiently tightly related to be explained in a single context (in other words, I thought having one thread per question would be redundant since, as the questions were quite close to each other, the same person answering one would probably have known the answer to the other).

fatteneder · June 14, 2023, 7:27pm

The way this works is that the module Minimax is assumed to be included in another module which also contains the modules NumericalErrorTypes, NumericalErrors, etc. This is why those are prefixed with .., which can be read as “go up one level and look for NumericalErrorTypes, ...”

nsajko · June 15, 2023, 10:05am

Yeah, it’s as fatteneder said, all the includes are in the root module, whose file, FindMinimaxPolynomial.jl, just looks like this:

module FindMinimaxPolynomial

include("NumericalErrorTypes.jl")
include("NumericalErrors.jl")
[...]
include("CompressedPolynomials.jl")
include("Minimax.jl")

end

The includes effectively include all the submodules’ code directly within the FindMinimaxPolynomial.jl file.

The submodules then just refer to other modules in the hierarchy.

One awkward thing to keep in mind is that the includes have to be ordered according to the module dependency graph. For example, NumericalErrors depends on NumericalErrorTypes, so include("NumericalErrorTypes.jl") has to come before include("NumericalErrors.jl").
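A compact, single-file illustration of both points (the .. prefix and the ordering constraint) might look like this. The submodules are written inline rather than via include, which amounts to the same thing, and the names Root, ErrorTypes, Checks, Tolerance and within are hypothetical, not taken from FindMinimaxPolynomial.jl.

```julia
module Root

# Must be defined first: Checks refers to it while Checks is being evaluated.
module ErrorTypes
export Tolerance
struct Tolerance
    value::Float64
end
end # module ErrorTypes

module Checks
import ..ErrorTypes              # ".." means "the enclosing module", i.e. Root
using ..ErrorTypes: Tolerance    # bring one specific name into scope

# True when a and b differ by at most the given tolerance.
within(t::Tolerance, a, b) = abs(a - b) <= t.value
end # module Checks

end # module Root

ok = Root.Checks.within(Root.ErrorTypes.Tolerance(0.5), 1.0, 1.25)
```

Swapping the two submodule definitions makes the import in Checks fail, because Root.ErrorTypes does not exist yet at that point.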