I am trying to pass an object into a function I optimize with Optim.jl, mutate it there, and access the result in the next iteration.
While testing, I noticed that I can change elements of an array defined outside the function I want to optimize from inside it. However, I cannot do the same with an integer.
Here is an example:
# The function I want to optimize
function opt_fnc(x, iter)
    y = x[1]^2 + x[1]*2
    # Show iteration counter
    println(iter)
    # Add one to iteration counter
    iter += 1
    # Return function value
    return y
end
# Initialize iteration counter
iter = 1
# Define starting value
x0 = [1.0]
# Optimize function
optimize(x -> opt_fnc(x,iter), x0, BFGS())
This prints 1 for each iteration. However, if I define iter as a one-element array, the change is kept for the next iteration:
# The function I want to optimize
function opt_fnc(x, iter)
    y = x[1]^2 + x[1]*2
    # Show iteration counter
    println(iter)
    # Add one to iteration counter
    iter[1] += 1
    # Return function value
    return y
end
# Initialize iteration counter
iter = [1]
# Define starting value
x0 = [1.0]
# Optimize function
optimize(x -> opt_fnc(x,iter), x0, BFGS())
Since iter is passed as a parameter to opt_fnc, the reference to iter in iter += 1 is to the parameter, not to the global variable. Rebinding the Int parameter inside the function has no effect on the caller's variable. The parameter iter "shadows" the global variable of the same name.
In the case of the array, the iter array points to the same allocated block of memory and is unchanging during execution (whether copied or shadowed). The changing element is iter[1].
The first Int version would work if the iter parameter to opt_fnc were removed and the function accessed the global variable directly (which requires a global iter declaration before the assignment). Using global variables is not recommended, though. If you do use one, it is best for performance to give it a type annotation: iter::Int = 1 (typed globals require Julia 1.8 or later).
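To make the difference concrete, here is a minimal standalone sketch (function names are made up for illustration) showing that rebinding an Int argument does nothing for the caller, while writing into an array argument persists:

```julia
# An Int argument is only rebound locally; the caller's variable is untouched.
function bump_int(n)
    n += 1          # rebinds the local name n only
    return n
end

# An array argument refers to the caller's array, so element writes persist.
function bump_array!(a)
    a[1] += 1       # mutates the shared array in place
    return a
end

counter = 1
bump_int(counter)
@assert counter == 1         # unchanged: the parameter shadowed the variable

counter_arr = [1]
bump_array!(counter_arr)
@assert counter_arr[1] == 2  # changed: same underlying array
```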
That makes sense. Thanks!
Now, if iter was not just an iteration counter but, e.g., a large array, I’d have to change all elements in a loop, right? Or is there a better/faster way?
So, I’d have to do this:
# The function I want to optimize
function opt_fnc(x, iter)
    y = x[1]^2 + x[1]*2
    # Show current status
    println(iter)
    # Add a random number to each element of the large array
    for i in eachindex(iter)
        iter[i] = iter[i] + rand()
    end
    # Return function value
    return y
end
# Initialize some large array
iter = [1.0 2.0]
# Define starting value
x0 = [1.0]
# Optimize function
optimize(x -> opt_fnc(x,iter), x0, BFGS())
But that is probably quite slow, and something like this would be faster:
# The function I want to optimize
function opt_fnc(x, iter)
    y = x[1]^2 + x[1]*2
    # Show current status
    println(iter)
    # Add a random number to the large array
    iter = iter .+ rand(Float64, (1, 2))
    # Return function value
    return y
end
# Initialize the large array
iter = [1.0 2.0]
# Define starting value
x0 = [1.0]
# Optimize function
optimize(x -> opt_fnc(x,iter), x0, BFGS())
It is not exactly the same thing, because you are missing a dot. The line:
iter = iter .+ rand(Float64, (1, 2))
replaces the iter array entirely with a new one (so it allocates, will be slow, and is probably not what you want), while:
iter .= iter .+ rand(Float64, (1, 2))
is closer in meaning to the loop, but the rand call still allocates, since it needs to produce an array of size 2, and it will not participate in loop fusion. I think that
iter .= iter .+ rand.(Float64)
will do the trick for full loop fusion and no allocations (I am not sure), but I know that your best chance at optimal code is to write the loop yourself:
for i in eachindex(iter)
    iter[i] += rand()
end
Anyway, my rule of thumb is that a loop is usually faster in Julia, unlike in e.g. R, so if you can write the loop explicitly, you probably should. Moreover, loops are easier to reason about.
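To illustrate the difference (helper names here are made up), compare a broadcast that builds a new array against an in-place loop:

```julia
# add_alloc builds a brand-new array on every call (like iter = iter .+ ...),
# while add_inplace! overwrites the existing array (like the explicit loop).
add_alloc(v) = v .+ rand.(Float64)

function add_inplace!(v)
    for i in eachindex(v)
        v[i] += rand()     # no new array is created
    end
    return v
end

v = ones(1000)
w = add_alloc(v)
@assert w !== v            # a different array came back: it allocated
add_inplace!(v)
@assert all(v .>= 1.0)     # the original array itself was updated
```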
Consider also passing a struct into opt_fnc with the data inside. This would avoid global variables, which might come back to bite you later in development.
Awesome. Thanks a lot for the explanations. It will be a number of nested loops then. It’s a bit counterintuitive that loops are often the fastest option. Combined with parallel threading, it will be difficult to beat.
Well, if you think about it, vectorized code in R or NumPy is simply calling loops written in C. Here, the language is compiled, and therefore writing the loops yourself (as in C) is usually fastest: the compiler knows loops very well, while vectorized operations are overhead for him. (Yes, I personify my compiler. He cares about me.)
Moreover, take a look at the LoopVectorization.jl package, which, through a simple macro, makes your loops… vectorized! But not exactly the same kind of vectorization: SIMD vectorization, a kind that is sweeter to your CPU, and therefore to performance.
Finally, you should time your code with BenchmarkTools.jl if you are really after the last drop of juice.
With reference to the original question, global iter can be avoided using a let block:
using Optim
let iter = 1
    global function optfnc(x::Vector{Float64})
        y = x[1]^2 + x[1]*2
        println(iter)
        iter += 1
        return y
    end
end
x0 = [1.0]
optimize(x -> optfnc(x), x0, BFGS())
or a closure:
using Optim
function foo(x0::Vector{Float64})
    iter = 1
    function optfnc(x::Vector{Float64})
        y = x[1]^2 + x[1]*2
        println(iter)
        iter += 1
        return y
    end
    optimize(x -> optfnc(x), x0, BFGS())
end
foo([1.0])
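The closure pattern also works without Optim at all; here is a standalone sketch (names are made up) showing that the captured iter persists across calls to the inner function:

```julia
# iter lives in make_counter's scope and is captured by both inner functions,
# so the counter survives between calls with no global variable.
function make_counter()
    iter = 0
    function objective(x)
        iter += 1           # updates the captured variable
        return x^2 + 2x
    end
    nevals() = iter         # read the current count
    return objective, nevals
end

f, nevals = make_counter()
f(1.0); f(2.0); f(3.0)
@assert nevals() == 3       # three evaluations were recorded
```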
Thanks a lot for the advice. It is just counterintuitive because I come from interpreted languages where loops are always bad. Loops are great, in my opinion, if they are fast, exactly because of the points you mentioned: they are often easy to understand and clear to write.
The LoopVectorization package looks interesting. I’ll have a look at it. Do you know how it compares to multithreading? E.g., using it for updating arrays (like here) is often faster, but larger tasks are better done with multithreading?
Makes sense. I think this is an important issue that should be clarified by the core developers at some point. Combined with the name it gets confusing
LoopVectorization.jl supports both SIMD vectorization and multithreading simultaneously: you can use both techniques at the same time and get the combined speedup.
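A Base-only sketch of combining the two by hand (LoopVectorization.jl’s @turbo macro automates something similar; the chunking scheme here is just illustrative): threads split the array into chunks, and @simd vectorizes the inner loop over each chunk.

```julia
using Base.Threads

# Sum a vector using one chunk per thread (multithreading) and a
# SIMD-vectorized inner loop within each chunk.
function threaded_simd_sum(v::Vector{Float64})
    n = length(v)
    partial = zeros(nthreads())       # one partial sum per chunk
    @threads for t in 1:nthreads()
        lo = 1 + (t - 1) * n ÷ nthreads()
        hi = t * n ÷ nthreads()
        s = 0.0
        @simd for i in lo:hi
            s += v[i]
        end
        partial[t] = s
    end
    return sum(partial)
end

v = collect(1.0:1000.0)
@assert threaded_simd_sum(v) ≈ 500500.0   # 1 + 2 + … + 1000
```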