You didn’t include the global variable in the definition of tst2
.
I am sure @dpsanders is much more knowledgeable than I am, but I find it necessary to point out that
The word
const
is really a misnomer. It is not that the variable is constant. It is the variable type that is constant, which is quite different.
is just completely wrong. The binding is constant, not the type. What difference does this make? Exactly what @dpsanders pointed out, but to explain it more exhaustively: if the binding var1 is constant, then you cannot assign a new value/object to var1 (even of the same type). You can change a mutable field inside that value, but you can never replace the entire object bound to var1 with another.
That is: if tmp1 is an array, then you can add, remove, and replace elements, but you can never replace the original array with a brand new array (for example, to “empty” the array temporarily while keeping the old array in an auxiliary variable, with the intent of putting it back afterwards without making any copies; basically a swap). It is not the type that is fixed; it is the initial object (which can have mutable fields) that is fixed there forever. If it is an Int, as in your example, this means you can never change that value (because an Int has no mutable fields you can change). That name will point to the same ‘memory position’/reference/value for its entire existence.
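To make the distinction concrete, here is a minimal sketch of a const binding to a mutable array. The name tmp1 follows the text above; the values are made up for illustration.

```julia
# A const binding to a mutable object: the contents can change, the binding should not.
const tmp1 = [1, 2, 3]

push!(tmp1, 4)      # adding an element mutates the same array
tmp1[1] = 10        # replacing an element is fine too
snapshot = copy(tmp1)

empty!(tmp1)        # emptying in place still keeps the same array object

# What const rules out is rebinding the name to a different array:
# tmp1 = [7, 8, 9]  # at top level this warns or errors, depending on the Julia version
```

Every operation that works here goes through the existing array. The commented-out line is the only one that tries to swap the object itself.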
I admit I am a beginner in Julia, which is why I am posting. Consider my duplication of the example provided by @dpsanders :
julia> const cc = 3
3
julia> const cc =5
WARNING: redefining constant cc
5
WARNING: redefining constant zzz
Note that the printout is different and my constant changed. I am using Julia 1.4.0-rc1.0. His example is copied from his message above (what version are you using, @dpsanders?)
julia> const V1 = 3
3
julia> const V1 = 5
WARNING: redefinition of constant V1. This may fail, cause incorrect answers, or produce other errors.
5
julia> const V1 = 5.0
ERROR: invalid redefinition of constant V1
Stacktrace:
[1] top-level scope
@ REPL[12]:1
To edit a specific method, type the corresponding number into the REPL and press Ctrl+Q
This is just a convenience to be used in the REPL instead of re-starting it. If you check the manual section on constants you will find:
Note that although sometimes possible, changing the value of a
const
variable is strongly discouraged, and is intended only for convenience during interactive use. Changing constants can cause various problems or unexpected behaviors. For instance, if a method references a constant and is already compiled before the constant is changed then it might keep using the old value: […]
In other words, the documentation gives no guarantee that it will work, and if it works there may be hidden problems. I really would have preferred that the behavior was just always throwing an error, as this ends up causing more headaches than just having to re-start the REPL, if someone does this without reading the manual.
Thank you. So the bottom line seems to be (as I have read in several places):
(this does make a greater impact when one is actually programming as opposed to simply reading.)
- do not use globals
- if you use globals, the preference is not to use const
- if efficiency is the desired outcome, use structs to store any global parameters
- the preference is to use immutable structs for maximum efficiency.
- Minimize the number of parameters defined outside a function.
So my remaining question is: if I use a struct, and initialize a structure in the global space, must I make it a const for maximum efficiency? Initializing in the global space is the only solution UNLESS one uses command-line arguments or reads data from an input file (which are likely the best and most flexible approaches).
Did I miss anything important?
Thank you, again.
About your list, well, let me give you my more nuanced (and incredibly verbose) vision:
- Avoid globals, especially for parameters. Just use a Dict, or create your own struct and pass it along (there are some package solutions too). That said, I use globals myself: I have a global timer in my module, so I can reset it, run a bunch of my methods, and then see how much time was spent in each method (and in each method called inside them). Is it the best solution? No. But it is good enough and used only for debugging, not for anything that will end up in a paper.
- Supposing you do use globals: if you may need to change the entire object, including its type, do not use const (and take the performance hit). If the type never changes, but you may need to replace the whole object, not just a field, then use something like const var = Ref(10) for storing an Int. This creates a const binding to a reference (you need to use [] to access, or set, the Int value), so you get type stability and can still change the value safely. If you have a lot of these globals (type-fixed but holding an object that may need to be wholly replaced), then instead of using Ref you may also declare your own mutable struct and use it with const and without the Ref (because the fields are mutable and can be changed safely). If you are completely sure you will never need to replace the entire object but only mutate its fields (or call mutating methods on it), then use const directly on the value, without wrapping it in a Ref.
- I do not think the main point of using structs is efficiency, but just that you are following (1) by doing this. Unless you are talking about using a global struct to store the values, which I have addressed in the point above. Basically, if const was used (one way or another) then the performance of the global variable will not be much different from that of passed-along parameters (but I welcome someone proving me wrong).
- If you have a group of fields that never change (or change very rarely), or that change but almost all at the same point (so it costs almost the same as creating the struct again), then immutable structs are a good idea. If you need to change the values of single fields regularly, not so good. If you actually have a loop that needs to change a single field in that structure multiple times, then definitely use a mutable struct.
- Yes. This is basically (1) again.
I addressed your question about structs and global space in (2) I think.
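Here is a rough sketch of the two const patterns from point (2). The names counter, bump!, Settings, and settings are hypothetical, chosen only for illustration.

```julia
# Pattern 1: const binding to a Ref, so the stored type stays fixed at Int
const counter = Ref(10)

function bump!()
    counter[] += 1      # read and write the value through []
    return counter[]
end

# Pattern 2: const binding to your own mutable struct, no Ref needed
mutable struct Settings
    tol::Float64
    maxiter::Int
end

const settings = Settings(1e-6, 100)

settings.maxiter = 500  # changing a field is safe; the binding itself never changes
bumped = bump!()
```

In both cases the name always refers to the same container, which is what lets the compiler rely on its type.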
I am not sure I understand the “is the only solution UNLESS” bit. This is a little crude, but why do you not just give every method a last parameter called params that is either a Dict, a NamedTuple, or your own struct, and have each method pass it along to the methods it calls? Then you can have a main function like:
function main()
# initialize your struct/dict/tuple here
params = ...
# call your method with such params
my_method(..., params)
end
main()
And instead of changing global variables, change the definition of params.
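A filled-in version of that idea might look like the sketch below. It assumes a NamedTuple for params; the field names tol and maxiter and the helper inner_step are made up for the example.

```julia
# Hypothetical helper that receives params from its caller
function inner_step(x, params)
    return x * params.tol
end

# Each method takes params last and passes it along to whatever it calls
function my_method(x, params)
    total = 0.0
    for _ in 1:params.maxiter
        total += inner_step(x, params)
    end
    return total
end

function main()
    # A NamedTuple here; a Dict or your own struct would be passed the same way
    params = (tol = 0.5, maxiter = 4)
    return my_method(2.0, params)
end

result = main()
```

No global is read anywhere: to try different settings, only the line that builds params changes.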
This really depends on your application. If the value is actually a constant, use const
. If you want a mutable container, also use const
.
Again, if you need mutable struct
because you change those values, that is fine too.
If the function gets the value as a parameter, then it does not matter. Eg like this:
function foo(parameters)
...
end
par = parse_from_args(ARGS)
foo(par)
Great answer!!! Yes, my answer was very crude in comparison. I will perhaps perform some experiments. My preference is not to use globals. Passing known types via function arguments is my preference.
Thanks again! I hope this helps others. I have nothing to add.
Gordon
That is good to know! However, wouldn’t that only be true if the parameter values had specified types? For example:
a = 35
function anyfunc(a::Int64)
...
end
would be as efficient as using const a
since the function knows the type of its argument. However,
a = 35
function anyfunc(a)
...
end
would be less efficient, since the type of a could change and the compiler has to take that into account. Whatever the answer, I assume that the more information is known to the compiler, the higher likelihood of improved efficiency.
No, not at all. With few exceptions, the function signature has no effect on the performance of the code within that function.
No, it’s not like that. Julia does just-in-time (JIT) compilation. This means that functions are not compiled when you define them, but when you run them. And when you run anyfunc(a), the type of a is always known!
So, in your first example:
function anyfunc(a::Int64)
...
end
that won’t be compiled until it is called for the first time. It’s exactly the same with:
function anyfunc(a)
...
end
The difference is that in this second version, if you do anyfunc(1.0) it will compile new, different code for Float64 arguments, while in the first version it will throw an error, unless you have defined a different method that allows that type.
In general, performance is more affected by the code inside the function (e.g. ensuring type stability) than by prior knowledge of the argument types.
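A small sketch of that behaviour, using two made-up functions (double_untyped and double_typed) in place of anyfunc:

```julia
# Untyped argument: Julia compiles a separate specialization per concrete argument type
double_untyped(a) = a + a

# Typed argument: the annotation only restricts which calls are accepted
double_typed(a::Int64) = a + a

r1 = double_untyped(1)      # specialized for Int64 on first call
r2 = double_untyped(1.0)    # specialized again, this time for Float64
r3 = double_typed(1)        # specialized for Int64 on first call

# Calling double_typed(1.0) fails because no method accepts a Float64
typed_call_failed = try
    double_typed(1.0)
    false
catch err
    err isa MethodError
end
```

For the Int64 call both functions are compiled with the argument type known; the annotation affects dispatch, not the speed of the body.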
In addition to the excellent answers by others: I think you misunderstand how Julia’s compilation model works, which will make it difficult to write performant code. I would recommend reading the whole of
https://docs.julialang.org/en/v1/manual/performance-tips/
and then
IMO it is really worth investing in this.
Thank you, Tamas! I will certainly do so. My codes are already very performant, but one can always improve.
At the moment, I work completely within functions, do pre-allocation, use the “.” operator as needed, etc, all based on discourse discussions and reading documentation. But of course, there is understanding and there is grokking.
I do agree that I do not have a good understanding of how compilation works in Julia, but I do understand lazy compiling, late binding, etc.
I will return …
I have not tested it yet, but I think that Dicts for parameters are slower than (concretely typed) structs:
- For each Dict access, the key needs to be hashed and a lookup performed. Lookups of values in structs should be much faster.
- If you have multiple parameter types, the Dict will no longer have a concrete value type (worst case Any), thus the type of any elements in this Dict cannot be determined at Compile-time anymore.
“2. If you have multiple parameter types, the Dict will no longer have a concrete value type (worst case Any), thus the type of any elements in this Dict
cannot be determined at Compile-time anymore.”
In that case, one solution is to have multiple dictionaries, one for each type. But it is still not fully clear.
If the function accesses specific elements of the dictionary in the function, wouldn’t the compiler know the types?
If access to the variables were via a loop, then the knowledge is lost. Ultimately, it is all about the level of sophistication of the compiler. I recognize that these issues can be quite difficult.
param = Dict(:a => 2.3, :b => 5)
function tst(p)
a = p[:a]
b = p[:b]
end
tst(param)
Doesn’t the compiler know the types of the variables a
and b
?
No. p
is a Dict{Symbol, Real}
, so the compiler can only assume that a
and b
are both some subtype of the abstract type Real
. This is easy to check for yourself:
julia> @code_warntype tst(param)
Variables
#self#::Core.Compiler.Const(tst, false)
p::Dict{Symbol,Real}
a::Real
b::Real
Body::Real
1 ─ (a = Base.getindex(p, :a))
│ %2 = Base.getindex(p, :b)::Real
│ (b = %2)
└── return %2
Given the entire program as you’ve written it, a hypothetical compiler could perform enough constant propagation to turn your entire code into return 5
, but the compiler we actually have isn’t going to do that for you.
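For comparison, here is a sketch of the same two parameters stored in a concretely typed struct. The struct name Params and the accessor get_b are hypothetical.

```julia
# Hypothetical struct replacing the Dict from the example above
struct Params
    a::Float64
    b::Int
end

get_b(p) = p.b

dict_params = Dict(:a => 2.3, :b => 5)
struct_params = Params(2.3, 5)

dict_value_type = valtype(dict_params)                     # the Dict's declared value type
struct_return_types = Base.return_types(get_b, (Params,))  # what inference sees for the struct
field_b_type = fieldtype(Params, :b)                       # concrete type stored in the field
```

With the Dict, all the compiler knows about a looked-up value is the value type of the container. With the struct, each field carries its own concrete type.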
Got it. Thanks! It does all make sense.
Gordon
Base.@kwdef is really useful for defining parameter structures: it lets you define default values directly in the struct definition.
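A minimal sketch of what that looks like; the struct name SolverParams and its fields are invented for the example.

```julia
# Hypothetical parameter struct; field names and defaults are for illustration only
Base.@kwdef struct SolverParams
    tol::Float64 = 1e-6
    maxiter::Int = 100
    verbose::Bool = false
end

defaults = SolverParams()              # every field takes its default
custom = SolverParams(maxiter = 500)   # override a single field by keyword
```

The fields stay concretely typed, so this keeps the performance of a plain struct while making construction almost as convenient as a Dict.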
Sorry, my home lost electricity for the last eight hours. When I said:
This is a little crude, but why do you not just give every method a last parameter called
params
[…]
I meant that my suggestion was a little crude, because it was vague and did not point out the differences between the options. I was not referring to your comment.
I worry a little about your example:
a = 35
function anyfunc(a)
...
end
It seems like you think there is a connection between the global variable a and the parameter a. The two of them only share the same name (and therefore, when you reference a in your function, it will always refer to the parameter, not the global variable); they are two completely distinct things. They only become the same thing if you call anyfunc passing the global a as the argument.
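A tiny sketch of the point, with a made-up function body:

```julia
a = 35                  # global variable named a

function anyfunc(a)     # parameter named a; inside the function it shadows the global
    return a + 1
end

from_literal = anyfunc(1)   # uses the argument 1; the global a plays no role
from_global = anyfunc(a)    # only here does the global's value enter the function
```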
Yes. I am absolutely sure of it. However, in a prototype phase, it can be more convenient to use a Dict and not worry about adding and removing fields (and changing the construction of the object) each time something changes. I was just suggesting things that work, with different degrees of flexibility and performance. I use a Dict myself, because I get it from the argument parser, and the performance of accessing the parameters is completely irrelevant in my case (the code takes minutes to hours, most of the time spent in the Gurobi solver; access to parameters does not account for even 0.01% of the total time, probably orders of magnitude less).
As already answered, this is not guaranteed. However, if you use a Dict you should probably never access a parameter repeatedly inside a loop, but instead do: param_name :: TypeOfParam = params_dict[:param_name] before such heavy use of the parameter. As I said above, use a Dict only if the convenience matters more, and you do not have to use a lot of the parameters in each iteration of tight loops (you can unpack all values as I suggested above, but at that point it is probably better to just use an immutable struct and access the fields directly).
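The unpacking idea could look roughly like this. The keys n_iter and step and the function run_loop are hypothetical, and the Dict{Symbol,Any} stands in for whatever an argument parser returns.

```julia
# Hypothetical parameters, stored with an abstract value type as a parser might return them
params_dict = Dict{Symbol,Any}(:n_iter => 1000, :step => 0.5)

function run_loop(params_dict)
    # Unpack once, with type annotations, before the hot loop
    n_iter::Int = params_dict[:n_iter]
    step::Float64 = params_dict[:step]

    total = 0.0
    for _ in 1:n_iter
        total += step   # the loop only touches concretely typed locals
    end
    return total
end

total = run_loop(params_dict)
```

The Dict lookups and the type uncertainty are paid once at the top of the function; inside the loop the compiler works with an Int and a Float64.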