Efficient global parameters

I’m getting back into Julia after a long time not using it, so this is a fairly basic question.

I have a cell in my notebook something like this:

parameters = [
   [1.0, 2.0],
   [3.0, 4.0],
   [5.0, 6.0],
   [7.0, 8.0],
   [9.0, 10.0],
]

I realised this is probably going to give me a performance hit due to being an untyped global variable, but I’m unsure of the best way to fix this.

The obvious option is to declare it a const, but that means I have to restart the session every time I want to change it, which I’d prefer to avoid. Other than that, the general advice is to avoid global variables — but I’m unsure how to do that in this case, since there are several other cells in the notebook that need to refer to this array of parameters.

So is there a way to avoid globals in this case, allowing me to specify the type of the variable and get the performance benefit, while still being able to define it in one cell of a notebook and access it in others?

I found the Parameters package, but it seemed like it was actually solving a different problem from this one, to do with initialising tuples with keyword arguments. (Or am I wrong about that?)

There are two straightforward ways to avoid the performance penalty of non-const globals: type assertions and function boundaries. Consider these example functions:

using BenchmarkTools
parameters = [[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0],
              [9.0, 10.0]]

function f1()
    s = 0.0
    a = parameters[1][1]
    for i = 1:1000000
        s += a
    end
    return s
end

function f2()
    s = 0.0
    a::Float64 = parameters[1][1]
    for i = 1:1000000
        s += a
    end
    return s
end

function g(a)
    s = 0.0
    for i = 1:1000000
        s += a
    end
    return s
end

function f3()
    return g(parameters[1][1])
end

If we time them

julia> @btime f1()
  13.503 ms (1000001 allocations: 15.26 MiB)
1.0e6

julia> @btime f2()
  886.341 μs (1 allocation: 16 bytes)
1.0e6

julia> @btime f3()
  886.262 μs (2 allocations: 32 bytes)
1.0e6

we see that f1 is really slow. The compiler cannot infer the type of a, because it comes from a non-const global, so every s += a in the hot loop is dynamically dispatched and allocates. This penalty is gone in f2 because we have promised the compiler that a is a Float64 within the loop (and you will get an error if parameters[1][1] actually is something else). f3 is also fast, because g is specialized to the type used in the call.

To be clear, you still pay a price for the non-const global, but you pay it once instead of a million times inside the loop.

You can also just pass the global variable as an argument to your functions. That allows you to have them in one cell, while using them in functions as local variables (they don’t get copied).
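
As a minimal sketch of that approach (the function name sum_first_components is made up for illustration, it is not from the thread), the parameters stay in one cell as a plain global, and any function that needs them receives them as an argument:

```julia
# Defined once, in its own notebook cell (an ordinary untyped global).
parameters = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]

# Hypothetical helper: it only ever sees its argument `params`, so Julia
# compiles a version specialized on the concrete argument type
# (here Vector{Vector{Float64}}) and the loop is type-stable.
function sum_first_components(params)
    s = 0.0
    for p in params
        s += p[1]
    end
    return s
end

# The global is looked up once at the call site; no copy of the array is made.
total = sum_first_components(parameters)
```

The untyped global is still touched once per call, but everything inside the function works with a local of known type.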

Just define the parameters as const and you are fine…

You can still change them, only the type will be constant. For example you can still do:

parameters[2] = [3.0, 4.0]

Or if you want to change all parameters in one go you can do:

parameters .= [
          [1.0, 2.0],
          [3.0, 4.0],
          [5.0, 6.0],
          [7.0, 8.0],
          [9.0, 10.0],
       ]
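
One caveat worth sketching: .= only works when the new array has the same length as the old one. If the number of parameter sets changes, something like the following should still avoid rebinding the const (PARAMS and new_values are hypothetical names, used here only to keep the example self-contained):

```julia
# Hypothetical const global holding the parameter sets.
const PARAMS = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Replacement values with a different length (2 entries instead of 3).
new_values = [[0.5, 1.5], [2.5, 3.5]]

# empty! followed by append! mutates the existing array in place, so the
# const binding PARAMS itself is never reassigned and no warning is raised.
empty!(PARAMS)
append!(PARAMS, new_values)
```

The element type still has to match: appending something that cannot be converted to a Vector{Float64} throws an error.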

Note that in the upcoming v1.8 (release candidate already available for download), you can add type annotations to your global variable (see PR #43671) to bypass this issue.

This really is the best solution: global variables are bad for perf, but they also just make reasoning about code hard unless you know all of the code already. For example, a recurring problem in this forum is someone posting a function that clearly depends on variables it doesn’t take in as arguments, without realizing they need to show how those unspecified variables are set before another person can help them debug.

People mix this up all the time, but you can’t change the binding of a const global (however you can mutate the contents, which is what you do in that post). See Values vs. Bindings: The Map is Not the Territory · John Myles White (written by another poster in this thread!).

E.g.

julia> const x = 1.0
1.0

julia> const x = 2.0
WARNING: redefinition of constant x. This may fail, cause incorrect answers, or produce other errors.
2.0

This doesn’t change the type, but it does trigger a warning. As @Liozou pointed out, you can write

x::Float64 = 1.0

to define a typed global, which you can rebind but whose type cannot change.
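
Applied to the array of parameters from the original question, a sketch might look like this. It assumes Julia 1.8 or later, and the names typed_parameters and first_entry are invented for the example:

```julia
# Requires Julia >= 1.8: a global with a declared type.
typed_parameters::Vector{Vector{Float64}} = [[1.0, 2.0], [3.0, 4.0]]

# Hypothetical function reading the global directly. Because the type of
# typed_parameters is declared, the compiler can infer the result type.
function first_entry()
    return typed_parameters[1][1]
end

before = first_entry()

# Rebinding to a new value of the same type is allowed, with no warning
# and no session restart.
typed_parameters = [[5.0, 6.0]]

after = first_entry()
```

Assigning a value that cannot be converted to Vector{Vector{Float64}} (a String, say) throws an error instead of silently changing the type.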

I agree with this take. Defining things in global scope is not bad and in some types of work is a pain to avoid. You just need to be sure that any time you use those variables, you are actually using them in local scope! That is, you need to explicitly pass them to functions, etc.

I hate globals as much as the next person, but the time to first plot necessitates using notebooks, and IMHO it’s kinda hard to get the most out of notebook-based coding without using globals - the whole notebook system seems to be designed around them. If it wasn’t for plotting I’d code in .jl files and never use global variables at all.

But let’s not let the discussion slide too far into the pros and cons of globals in general. I think being able to declare the types of globals will make a big difference for notebook-based coding, so I’m looking forward to 1.8. All the answers here were really helpful, I learned a lot from them. Thank you very much!

I take your point on not derailing the thread too much, but I’m not sure I see how time to first plot is relevant to which coding environment you work in? Also I don’t see why notebooks necessitate globals, I write functions all the time in notebooks.

Finally, if you find time to first plot to be a real issue in day-to-day work consider just making a sysimage with your favourite plotting package in it and a few plot(...) calls to eliminate it.

Oh, this is going to get even a bit more derailed now, but time to first plot is relevant because I want to do some calculations and then plot them. If I do that in a script it’s hard to iterate on my code, because every time I run it I have to wait for Plots to start up, whereas if I do it in a notebook then I only have to wait once and then I can iterate all I want. It doesn’t make much difference once everything’s all done and working and I just want to run it, but when iterating during the development process it makes enough of a difference to push me into using notebooks, which I otherwise wouldn’t do.

I guess I could code in a script and then import the script into a REPL environment instead of starting a new julia instance every time I run it, but I guess I just figured if you’re going to be running a REPL anyway you might as well use a notebook instead and get the advantages of that.

Maybe I’ll look into making a custom sysimage. I haven’t really looked into that before - it seems like there might be some disadvantages to that option as well. I’m not unhappy with using notebooks, generally speaking.

My preferred workflow is writing .jl files that I edit in VSCode. I then evaluate the current line, or current selection, or current cell (cells can be created, delimited by ##) as needed. The VSCode window has an integrated Julia process that remains open, often over days and weeks.

Have you tried VSCode?

And it has a sysimage-per-environment workflow built in:

https://www.julia-vscode.org/docs/dev/userguide/compilesysimage/