Why doesn't this "include-guard-ish" code evaluate in the correct order?

Hi all!

So I was trying to implement a mechanism similar to a very simple C-like include guard:

# file my_script.jl
if @isdefined my_script_provided
    print( "my_script.jl already loaded")
else
    # my code here
    my_script_provided = true
    print("Done!")
end

In that script I load some packages (with using) and read some data into DataFrames. The point is that I have this include dependency:

data_analysis.jl ---------> constants.jl ------+---->common.jl
                                               |
                                               |
data_extraction.jl ----------------------------+

where

  • constants.jl :
if @isdefined constants_provided
    print( "constants.jl already loaded")
else
    print("constants.jl not loaded. Loading...")

    include("common.jl");

    # other code

    df_model_master[!,mean_power_sym] = df_model_master[:,mean_power_sym] .* u"W";
    df_model_master[!, std_power_sym] = df_model_master[:, std_power_sym] .* u"W";
    df_model_master

    constants_provided = true
    print("Done!")
end
  • common.jl :
if @isdefined common_provided
    print( "common.jl already loaded")
else
    print("common.jl not loaded. Loading...")

    using CSV;
    using DataFrames;
    using Statistics;
    using Unitful;
    using Plots;
    using Plots.PlotMeasures;
    using Debugger;

    # other code

    common_provided = true
    print("Done!")
end

However, when I run the “client” script data_analysis.jl, I get this error:

ERROR: LoadError: UndefVarError: @u_str not defined
Stacktrace:
 [1] top-level scope
   @ :0
 [2] include(fname::String)
   @ Base.MainInclude ./client.jl:476
 [3] top-level scope
   @ ~/tesi/heap_lab/setup_heap_lab/Julia/data_analysis.jl:5
 [4] include(fname::String)
   @ Base.MainInclude ./client.jl:476
 [5] top-level scope
   @ REPL[1]:1
in expression starting at /home/alessandro/tesi/heap_lab/setup_heap_lab/Julia/constants.jl:50
in expression starting at /home/alessandro/tesi/heap_lab/setup_heap_lab/Julia/constants.jl:5
in expression starting at /home/alessandro/tesi/heap_lab/setup_heap_lab/Julia/data_analysis.jl:5

where

  • data_analysis.jl:5 -------> include("constants.jl");
  • constants.jl:5 -------------> if @isdefined constants_provided
  • constants.jl:50 --------------> df_model_master[!,mean_power_sym] = df_model_master[:,mean_power_sym] .* u"W";

It’s as if Unitful should already have been loaded, but it isn’t.

If I instead run the code manually in the correct order, it doesn’t throw an error.

Why is that?

Thanks!

First, I have to ask: is this needed for your project to work, or is there another reason you want to structure your code like this?

Second, it would be a lot easier to debug if you provided a runnable example.

I tried to recreate something small here, though this seems to work as expected. So I’m having a hard time doing more with what you gave me.

common.jl

if @isdefined common_provided
    println("common.jl already loaded")
else
    println("common.jl not loaded. Loading...")
    using Statistics

    common_provided = true
    println("Done!")
end

constants.jl

if @isdefined constants_provided
    println("constants.jl already loaded")
else
    println("constants.jl not loaded. Loading...")

    include("common.jl");

    constants_provided = true
    println("Done!")
end

other.jl

include("common.jl")

data = rand(10)
println("Mean $(mean(data))")

run.jl

include("constants.jl")
include("other.jl")

println("Std $(std(data))")

In order: no, it’s not a requirement. I just want the “setup code” to run automatically when I start my Julia session, but not to run again every time I run a script (I work from Emacs and tend to keep the same REPL open for the whole session). Moreover, one of my scripts (the one I called data_extraction.jl in my example) needs the data provided by common.jl, but not the data from constants.jl. Of course, if there is a better or more idiomatic way to accomplish this that I’m not seeing, I’d welcome it.

As for the MWE, I didn’t post all of the scripts so as not to clutter the post, but I can provide them if you think that would help. In any case, those are the relevant parts.

I suppose the reason your example works is that you don’t call anything containing the @u_str macro, i.e. the Unitful package. Can you tell me whether that is the point of failure for you too?

I’m not sure, probably depends on exactly what workflow you want.

Am I right in that you want to load common.jl every time you start Julia (in this project only?) since it contains some imports and data you always want to use? And then you have other scripts you might run interactively in the REPL, where many of them try to include (either directly or indirectly) common.jl in some way and you don’t want the common.jl code to rerun if it was already loaded?

  1. Why should common.jl not be reloaded? Is it that the data shouldn’t be reloaded?
  2. If you know that you always load common.jl at start, why do the other scripts need to load it? Is it that you want it to be possible to run the scripts as julia data_analysis.jl so the includes have to be there?
  3. Do data_extraction.jl and data_analysis.jl have nothing to do with each other?

I don’t think it is Unitful by itself at least, since I can create a workflow that seems similar to the one you mention.

common.jl

if @isdefined common_provided
    println("common.jl already loaded")
else
    println("common.jl not loaded. Loading...")
    using DataFrames, Unitful

    df_model_master = DataFrame(a=1:4, b=randn(4))

    common_provided = true
    println("Done!")
end

constants.jl

if @isdefined constants_provided
    println("constants.jl already loaded")
else
    println("constants.jl not loaded. Loading...")

    include("common.jl");

    df_model_master[!,:a] = df_model_master[:,:a] .* u"W"
    df_model_master[!,:b] = df_model_master[:,:b] .* u"W"

    constants_provided = true
    println("Done!")
end

data_analysis.jl

include("constants.jl")

@show df_model_master

With these files I can now start the REPL, first include common.jl, and then run data_analysis.jl without any problem.

julia> include("common.jl") # Load common first
common.jl not loaded. Loading...
Done!

julia> include("data_analysis.jl") # Load data_analysis, common won't be loaded again
constants.jl not loaded. Loading...
common.jl already loaded
Done!
df_model_master = 4×2 DataFrame
 Row │ a          b
     │ Quantity…  Quantity…
─────┼──────────────────────────
   1 │       1 W     0.665569 W
   2 │       2 W  -0.00197223 W
   3 │       3 W     -1.10029 W
   4 │       4 W      2.14784 W
4×2 DataFrame
 Row │ a          b
     │ Quantity…  Quantity…
─────┼──────────────────────────
   1 │       1 W     0.665569 W
   2 │       2 W  -0.00197223 W
   3 │       3 W     -1.10029 W
   4 │       4 W      2.14784 W

Regarding posting the full code: it would be better if you could create an MWE (minimal working example), which most likely is not the full files you have but something smaller, stripped of everything that is not needed to reproduce the strange behaviour.
I took some time to do this guesswork, but as we see it didn’t really get us anywhere, and in general you are much more likely to get people to help you if you already provide an MWE since that makes it much easier to both understand the problem and debug it.
Many times I would also say that the process of creating an MWE can help you solve the problem yourself.

C++ compiler vendors are still struggling to implement C++20 modules in 2023, and here you are in Julia, trying to use includes instead.

I strongly suggest putting this code into modules.
You can then call using on the module as many times as you’d like; the header guards are built in.
The module will also be precompilable.

To put these in a load path, I’d put them in a package and ] dev it. Creating packages is easy with PkgTemplates.jl.

You could create one package for each of common, constants, etc. Or one package with multiple modules, if that’s what you’d prefer.

Either way, I’d definitely use modules instead of trying to use header guards. That’s why we have them, and why languages like C++ are trying to transition to them.
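
To illustrate the “built-in header guard” behaviour, here is a minimal sketch. The module name Commons and the LOAD_COUNT counter are hypothetical, chosen only to mirror common.jl from this thread. The module is defined inline so the snippet is self-contained; with a real package on the load path the same holds for using PackageName.

```julia
module Commons   # hypothetical module mirroring common.jl from the thread

export instruction_col_title

# Counts how many times the module body has been evaluated.
const LOAD_COUNT = Ref(0)
LOAD_COUNT[] += 1
println("Commons: running setup code")

const instruction_col_title = "Instruction"

end # module Commons

# Each `using` only brings the exported names into scope;
# the module body above is not evaluated again.
using .Commons
using .Commons
using .Commons

println("setup ran ", Commons.LOAD_COUNT[], " time(s)")
println(instruction_col_title)
```

Note that the guard applies to using/import. Evaluating the module ... end block itself a second time (for example by including its file again) replaces the module and re-runs its body.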

EDIT:
I also avoid the PkgTemplates wizard, because it is nice to be able to just hit up in the search history and get a generic template. E.g., typing t = PkgT and hitting up, I get:

t = PkgTemplates.Template(user="chriselrod",plugins=[Git(ignore=["*#*", "*~", "*#*#"],branch="main",ssh=true),CompatHelper(),GitHubActions(),Codecov(),Documenter{GitHubActions}()]);
PkgTemplates.generate(t, "MyNewPkgName") # PkgT [up]

Most people will probably want to use a different user=.

So creating a new package only takes me a few seconds. The skeleton is all there, so all that’s left is deving it and filling out the code.


Correct!

I’ve seen problems when assigning Unitful units more than once, but that is a fairly minor issue. The major drawback is that as my input data grows, common.jl and constants.jl will have to load huge chunks of data, and that will become progressively slower. Granted, it can still be done, but I wanted to optimize (the same reason why I’m using a sysimage right now).

Correct.

In a way, no: data_extraction.jl populates a CSV with the data I need, and data_analysis.jl is one of the many analysis scripts I want to run. Since computing the CSV is expensive, I run data_extraction.jl only periodically, while data_analysis.jl should load the saved version of the CSV. Since I do that loading in constants.jl, I (think I) need to NOT execute constants.jl when I extract the new data.

Me too! I’m sorry, I realize I haven’t pointed this out before. The problem arises when I include constants.jl from data_analysis.jl, which in turn should include common.jl. If I manually load them in the “correct” order (first common.jl, then constants.jl, then data_analysis.jl), everything works. It seems that the added layer of indirection is causing the problem.

You’re right, I apologize if I wasn’t clear before. The files I provided contain the lines that seem to cause the problem (at least, I tried to pin them down). I see that it wasn’t a proper MWE, since it works correctly for you, so I’ll try to provide a more complete one.

I didn’t realize I could use them, you’re right! I also didn’t resort to them because I thought they were overkill for my problem, but from what I hear now they are EXACTLY what I need. Thanks!


MWE:

  • common.jl :
if @isdefined common_provided
    print( "common.jl already loaded")
else
    print("common.jl not loaded. Loading...")

    using CSV;
    using DataFrames;
    using Statistics;
    using Unitful;
    using Plots;
    using Plots.PlotMeasures;
    using Debugger;

    instruction_col_title = "Instruction"
    mean_power_col_title = "Base power mean (W)"
    std_power_col_title = "Base power std (W)"

    common_provided = true

    print("Done!")
end
  • constants.jl :
if @isdefined constants_provided
    print( "constants.jl already loaded")
else
    print("constants.jl not loaded. Loading...")

    include("common.jl");
    instruction_sym = Symbol(instruction_col_title)
    mean_power_sym = Symbol(mean_power_col_title)
    std_power_sym = Symbol(std_power_col_title)
    df_model_master = DataFrame([[1,2],[3,4],[5,6]],[instruction_col_title, mean_power_col_title, std_power_col_title])
    df_model_master[!,mean_power_sym] = df_model_master[:,mean_power_sym] .* u"W";
    df_model_master[!, std_power_sym] = df_model_master[:, std_power_sym] .* u"W";

    constants_provided = true

    print("Done!")
end
  • main.jl :
include("common.jl")
include("constants.jl")
  • run trace:
[alessandro@commodoroII MWE]$ julia main.jl # the "correct" order
common.jl not loaded. Loading...Done!constants.jl not loaded. Loading...common.jl already loadedDone![alessandro@commodoroII MWE]$ 
[alessandro@commodoroII MWE]$ julia constants.jl
ERROR: LoadError: UndefVarError: @u_str not defined
in expression starting at /home/alessandro/onedrive/tesi/Julia/MWE/constants.jl:11
in expression starting at /home/alessandro/onedrive/tesi/Julia/MWE/constants.jl:1[alessandro@commodoroII MWE]$ 

That certainly seems like the way to go!


I think I understand why you got the error, though. I minimized it to a single file where the error depends only on whether the using statement is inside the if statement.

In this case with Unitful, it errors if the using outside of the if statement is not there; the inner one does not seem to matter.

using Unitful # Remove this and it errors
if true
    using Unitful # This one does not really matter for the outcome
    a = 1u"W"
end

But running a similar thing with DataFrames, it only errors if neither of them is there, so having the using inside the if seems to work.

using DataFrames
if true
    using DataFrames
    a = DataFrame(a = [1, 2, 3])
end

The difference here seems to be macro vs. function, and after trying a few more packages this seems to hold. I assume this is because macros are expanded before any of the logic runs: the whole if block is a single top-level expression, so its macros are expanded before the using inside it has been evaluated, and at that point the macro has not been imported yet. In this particular case the macro call could never run without the using having run first, but proving that for arbitrary code would probably become messy quickly…
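
A minimal sketch of that guess, assuming the Printf standard library and its @sprintf macro can stand in for Unitful and u"W". Each snippet is evaluated in a fresh module (guarded_sandbox and hoisted_sandbox are names made up for this illustration) so that nothing imported earlier in the session interferes.

```julia
# Stand-in for the thread's example: Printf/@sprintf play the role of Unitful/@u_str.

# Case 1: `using` and the macro call live in the same `if` block.
guarded = Meta.parse("""
    if true
        using Printf
        s = @sprintf("%.1f W", 1.5)
    end""")

guarded_sandbox = Module()   # fresh module: Printf has not been imported here
guarded_failed = try
    Core.eval(guarded_sandbox, guarded)
    false
catch
    # The whole block is macro-expanded before any of it runs, so @sprintf
    # is looked up before `using Printf` has had a chance to execute.
    true
end

# Case 2: `using` is its own top-level expression, evaluated before the block.
hoisted_sandbox = Module()
Core.eval(hoisted_sandbox, :(using Printf))
hoisted_result = Core.eval(hoisted_sandbox, Meta.parse("""
    if true
        s = @sprintf("%.1f W", 1.5)
    end"""))

println("guarded block failed: ", guarded_failed)
println("hoisted using gives:  ", hoisted_result)
```

A function such as DataFrame(...) does not hit this problem because its name is only looked up when the call actually runs, by which time the using inside the block has been executed.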


I see, so it’s a matter of how macros are brought into scope during evaluation of modules. It’s a fix way too advanced for my knowledge, so I think I’ll just use modules, which are also the idiomatic way to go. Anyway, thanks to both! 😄🙏
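
For anyone who would rather keep the include guard, one possible workaround follows from the explanation above: leave only the guard in the if block and move the guarded body into its own file. include evaluates that file one top-level expression at a time, so using runs before the macro call is expanded. This is only a sketch: Printf/@sprintf again stand in for Unitful/@u_str, the file name common_body.jl is hypothetical, and the files are written to a temporary directory just to keep the snippet self-contained.

```julia
dir = mktempdir()

# The guarded body: plain top-level statements, no surrounding `if`.
write(joinpath(dir, "common_body.jl"), """
using Printf
label = @sprintf("%.1f W", 1.5)
""")

# The guard itself uses no package macro, only the include call.
write(joinpath(dir, "common.jl"), """
if @isdefined common_provided
    println("common.jl already loaded")
else
    println("common.jl not loaded. Loading...")
    include("common_body.jl")
    common_provided = true
    println("Done!")
end
""")

include(joinpath(dir, "common.jl"))   # first call runs the body
include(joinpath(dir, "common.jl"))   # second call takes the "already loaded" branch
println(label)
```

The simpler variant, already shown in the minimized example, is to hoist the using statements above the if block; using a package that is already loaded is cheap, so it does not need a guard.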

I’m sorry, there is one thing I just can’t figure out.

I have created a new package with PkgTemplates, then ran ] dev . on it. Now I can use import <my-package-name>, but IIUC this only loads the file src/<my-package-name>.jl. In my case, to decide in each file whether to load common.jl or constants.jl, do I need to create a package for each one?
I.e., I can’t see how having multiple modules in one package would work without having to do something like the following (I have already refactored common.jl → Commons.jl and structured it into a module):

include("Commons.jl")
using Commons

Is this what you were referring to? Is there a performance penalty in including the module each time?
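
One way the “one package, multiple modules” layout could look is sketched below. The package name HeapLab is hypothetical, and the modules are written inline only so the snippet runs on its own. In a real package the outer module would be src/HeapLab.jl, and each inner module would sit in its own file that src/HeapLab.jl pulls in once with include.

```julia
# Hypothetical layout, written inline so that it is self-contained.
module HeapLab

module Commons            # would be src/Commons.jl
export mean_power_col_title
const mean_power_col_title = "Base power mean (W)"
end

module Constants          # would be src/Constants.jl
using ..Commons           # refers to the sibling module; no include needed here
export mean_power_sym
const mean_power_sym = Symbol(mean_power_col_title)
end

end # module HeapLab

# A script that only needs the common part
# (with a dev'ed package this would be `using HeapLab.Commons`):
using .HeapLab.Commons
println(mean_power_col_title)

# A script that also needs the constants:
using .HeapLab.Constants
println(mean_power_sym)
```

With this layout the scripts themselves never call include. By contrast, include("Commons.jl") re-evaluates the file on every call and replaces the module, so the setup code would run again each time.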

So creating a new package only takes me a few seconds. The skeleton is all there, so all that’s left is deving it and filling out the code.

You probably know this already, but you can use the Develop() plugin in the template to cut that down to just writing the code 🙂
