I am trying to pass an object into a function I optimize with Optim.jl, mutate it there, and access the result in the next iteration.
While testing, I noticed that I can change elements of an array defined outside the function I want to optimize from inside it. However, I cannot do the same with an integer.
Here is an example:
# The function I want to optimize
function opt_fnc(x, iter)
    y = x[1]^2 + x[1]*2
    # Show iteration counter
    println(iter)
    # Add one to iteration counter
    iter += 1
    # Return function value
    return y
end
# Initialize iteration counter
iter = 1
# Define starting value
x0 = [1.0]
# Optimize function
optimize(x -> opt_fnc(x,iter), x0, BFGS())
This prints 1 for each iteration. However, if I define iter as a one-element array, the change is kept for the next iteration:
# The function I want to optimize
function opt_fnc(x, iter)
    y = x[1]^2 + x[1]*2
    # Show iteration counter
    println(iter)
    # Add one to iteration counter
    iter[1] += 1
    # Return function value
    return y
end
# Initialize iteration counter
iter = [1]
# Define starting value
x0 = [1.0]
# Optimize function
optimize(x -> opt_fnc(x,iter), x0, BFGS())
Since iter is passed as a parameter to opt_fnc, the reference to iter in iter += 1 is to the parameter, not to the global variable. Rebinding the Int parameter inside the function has no effect on the caller's variable. The parameter iter "shadows" the global variable of the same name.
In the case of the array, the iter array points to the same allocated block of memory and is unchanging during execution (whether copied or shadowed). The changing element is iter[1].
The first Int version would work if the iter parameter to opt_fnc were removed and the function accessed the global variable directly (which requires a global iter declaration before the assignment). Using global variables is not recommended, though. If you do use one, it is best for performance to give it a type annotation: iter::Int = 1 (typed globals require Julia 1.8 or later).
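To make the difference concrete, here is a minimal standalone sketch (function names are made up for illustration) showing that rebinding an Int argument does nothing for the caller, while writing into an array argument persists:

```julia
# An Int argument is only rebound locally; the caller's variable is untouched.
function bump_int(n)
    n += 1          # rebinds the local name n only
    return n
end

# An array argument refers to the caller's array, so element writes persist.
function bump_array!(a)
    a[1] += 1       # mutates the shared array in place
    return a
end

counter = 1
bump_int(counter)
@assert counter == 1         # unchanged: the parameter shadowed the variable

counter_arr = [1]
bump_array!(counter_arr)
@assert counter_arr[1] == 2  # changed: same underlying array
```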
That makes sense. Thanks!
Now, if iter was not just an iteration counter but, e.g., a large array, I’d have to change all elements in a loop, right? Or is there a better/faster way?
So, I’d have to do this:
# The function I want to optimize
function opt_fnc(x, iter)
    y = x[1]^2 + x[1]*2
    # Show current status
    println(iter)
    # Add a random number to each element of the large array
    for i in eachindex(iter)
        iter[i] = iter[i] + rand()
    end
    # Return function value
    return y
end
# Initialize some large array
iter = [1.0 2.0]
# Define starting value
x0 = [1.0]
# Optimize function
optimize(x -> opt_fnc(x,iter), x0, BFGS())
But that is probably quite slow, and something like this would be faster:
# The function I want to optimize
function opt_fnc(x, iter)
    y = x[1]^2 + x[1]*2
    # Show current status
    println(iter)
    # Add a random number to the large array
    iter = iter .+ rand(Float64, (1, 2))
    # Return function value
    return y
end
# Initialize the large array
iter = [1.0 2.0]
# Define starting value
x0 = [1.0]
# Optimize function
optimize(x -> opt_fnc(x,iter), x0, BFGS())
It is not exactly the same thing, because you are missing a dot. The line:
iter = iter .+ rand(Float64, (1, 2))
replaces the iter array entirely with a new one (so it allocates, will be slow, and is probably not what you want), while:
iter .= iter .+ rand(Float64, (1, 2))
is closer in meaning to the loop, but the rand call still allocates, since it needs to produce an array of size 2, and it will not participate in loop fusion. I think that
iter .= iter .+ rand.(Float64)
will do the trick for full loop fusion and no allocations (I am not sure), but I know that your best chance at optimal code is to write the loop yourself:
for i in eachindex(iter)
    iter[i] += rand()
end
Anyway, my rule of thumb is that a loop is usually faster in Julia, unlike in e.g. R, so if you can write the loop explicitly, you probably should. Moreover, loops are easier to reason about.
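To illustrate the difference (helper names here are made up), compare a broadcast that builds a new array against an in-place loop:

```julia
# add_alloc builds a brand-new array on every call (like iter = iter .+ ...),
# while add_inplace! overwrites the existing array (like the explicit loop).
add_alloc(v) = v .+ rand.(Float64)

function add_inplace!(v)
    for i in eachindex(v)
        v[i] += rand()     # no new array is created
    end
    return v
end

v = ones(1000)
w = add_alloc(v)
@assert w !== v            # a different array came back: it allocated
add_inplace!(v)
@assert all(v .>= 1.0)     # the original array itself was updated
```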
Consider also passing a struct into opt_fnc with the data inside. This would avoid global variables, which might come back to bite you later in development.
Awesome. Thanks a lot for the explanations. It will be a number of nested loops then. It’s a bit counterintuitive that loops are often the fastest option. Combined with parallel threading, it will be difficult to beat.
Well, if you think about it, vectorized code in R or NumPy is simply calling loops written in C. Here, the language is compiled, and therefore writing the loops yourself (as in C) is usually fastest: the compiler knows loops very well, while vectorized operations are overhead for him. (Yes, I personify my compiler. He cares about me.)
Moreover, take a look at the LoopVectorization.jl package, which, through a simple macro, makes your loops… vectorized! But not exactly the same kind of vectorization: SIMD vectorization, a kind that is sweeter to your CPU, and therefore to performance.
Finally, you should time your code with BenchmarkTools.jl if you are really after the last drop of juice.
With reference to the original question, global iter can be avoided using a let block:
using Optim
let iter = 1
    global function optfnc(x::Vector{Float64})
        y = x[1]^2 + x[1]*2
        println(iter)
        iter += 1
        return y
    end
end
x0 = [1.0]
optimize(x -> optfnc(x), x0, BFGS())
or a closure:
using Optim
function foo(x0::Vector{Float64})
    iter = 1
    function optfnc(x::Vector{Float64})
        y = x[1]^2 + x[1]*2
        println(iter)
        iter += 1
        return y
    end
    optimize(x -> optfnc(x), x0, BFGS())
end
foo([1.0])
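The closure pattern also works without Optim at all; here is a standalone sketch (names are made up) showing that the captured iter persists across calls to the inner function:

```julia
# iter lives in make_counter's scope and is captured by both inner functions,
# so the counter survives between calls with no global variable.
function make_counter()
    iter = 0
    function objective(x)
        iter += 1           # updates the captured variable
        return x^2 + 2x
    end
    nevals() = iter         # read the current count
    return objective, nevals
end

f, nevals = make_counter()
f(1.0); f(2.0); f(3.0)
@assert nevals() == 3       # three evaluations were recorded
```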
Thanks a lot for the advice. It is just counterintuitive because I come from interpreted languages where loops are always bad. Loops are great, in my opinion, if they are fast, exactly because of the points you mentioned: they are often easy to understand and clear to write.
The LoopVectorization package looks interesting. I’ll have a look at it. Do you know how it compares to multithreading? E.g., using it for updating arrays (like here) is often faster, but larger tasks are better done with multithreading?
Makes sense. I think this is an important issue that should be clarified by the core developers at some point. Combined with the name it gets confusing
LoopVectorization.jl supports both SIMD vectorization and multithreading simultaneously: you can use both techniques at the same time and get the combined speedup.
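A Base-only sketch of combining the two by hand (LoopVectorization.jl’s @turbo macro automates something similar; the chunking scheme here is just illustrative): threads split the array into chunks, and @simd vectorizes the inner loop over each chunk.

```julia
using Base.Threads

# Sum a vector using one chunk per thread (multithreading) and a
# SIMD-vectorized inner loop within each chunk.
function threaded_simd_sum(v::Vector{Float64})
    n = length(v)
    partial = zeros(nthreads())       # one partial sum per chunk
    @threads for t in 1:nthreads()
        lo = 1 + (t - 1) * n ÷ nthreads()
        hi = t * n ÷ nthreads()
        s = 0.0
        @simd for i in lo:hi
            s += v[i]
        end
        partial[t] = s
    end
    return sum(partial)
end

v = collect(1.0:1000.0)
@assert threaded_simd_sum(v) ≈ 500500.0   # 1 + 2 + … + 1000
```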