Update input argument in each Optim.jl iteration

How can I update an input argument so that in each iteration step n > 1, the value of b equals the value of b_sol from the (n-1)th iteration?

Here is the optimization call I am making. I was hoping that b would update at each step, but I am finding that b equals b0 at the beginning of every step. Is there a way to make the routine do what I have in mind?

function log_like(a, b)
   b_sol = contraction_mapper(a, b) # find fixed point b_sol given guess of a and b
   lnl = -sum(log.(prob(a, b_sol)), dims=1); # log likelihood given b_sol and a

   b = b_sol;  # update b guess with contraction mapping from last round
   return lnl
end

Optim.minimizer(optimize(a -> log_like(a, b0), a0, autodiff = :forward))

Hello and welcome to the community!

In general, you have no guarantee about the order in which the solver will call your objective function. Different line-search algorithms may call the objective several times before accepting a step, and Nelder-Mead, which is the default optimizer you are likely using above, also calls the function in a non-intuitive pattern.
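As a quick illustration (a minimal sketch; the toy quadratic objective and starting point are made up for the demo), you can count how often the default solver actually evaluates the objective:

using Optim

const ncalls = Ref(0)   # counter for objective evaluations

function f(x)
   ncalls[] += 1        # record every call the solver makes
   return sum(abs2, x)
end

res = optimize(f, randn(3))   # Nelder-Mead is the default for a vector input
println(Optim.iterations(res), " iterations, ", ncalls[], " objective calls")

The call count is typically a multiple of the iteration count, so any state you mutate per call will be updated more often, and in a different order, than you might expect.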

You could store the b value in a global variable, but it’s unlikely this is really what you want to do.

Thanks. Conceptually, storing b in a global should help: I expect the b_sol obtained from the contraction mapping to be closer to the true value than my initial guess of b, which I arbitrarily chose to be a zero vector. Updating b in a global should therefore reduce the computation at each step compared with starting from b0 every iteration.

However, I cannot figure out how to define the global so that the function recognizes it. Below is what I tried. The function does not seem to see the b defined outside it and throws an undefined-variable error. Can I define the global in a way that makes this work?

global b = b0;

function log_like(a)
   b_sol = contraction_mapper(a, b) # find fixed point b_sol given guess of a and b
   lnl = -sum(log.(prob(a, b_sol)), dims=1); # log likelihood given b_sol and a

   b = b_sol;  # update b guess with contraction mapping from last round
   return lnl
end

Optim.minimizer(optimize(log_like, a0, autodiff = :forward))

b is already global in global scope (outside of the function); you need to declare it global inside the function, like so:

function log_like(a)
   global b  # declare b global so the assignment below updates the global binding
   b_sol = contraction_mapper(a, b) # find fixed point b_sol given guess of a and b
   lnl = -sum(log.(prob(a, b_sol)), dims=1); # log likelihood given b_sol and a

   b = b_sol;  # update b guess with contraction mapping from last round
   return lnl
end
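The scoping rule at work here, shown in isolation (a minimal standalone example): assignment inside a function creates a new local variable unless the name is declared global.

y = 1

function set_local()
   y = 2          # creates a new local y; the global y is untouched
end

function set_global()
   global y
   y = 2          # assigns to the global binding
end

set_local();  @show y   # y == 1
set_global(); @show y   # y == 2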

To make it performant, I would declare

const b = copy(b0)

function log_like(a)
   b_sol = contraction_mapper(a, b) # find fixed point b_sol given guess of a and b
   lnl = -sum(log.(prob(a, b_sol)), dims=1); # log likelihood given b_sol and a

   b .= b_sol;  # update the contents of b in place; no global declaration needed, since .= mutates rather than rebinds
   return lnl
end

i.e., make b a const variable and then update its contents in place using .=.

Non-constant global variables have a big performance impact in Julia, but you can easily use a const global here, which has no performance penalty since the compiler knows its type.
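A minimal sketch of the difference, assuming BenchmarkTools is installed (the vectors here are only for demonstration):

using BenchmarkTools

x_plain = rand(1000)        # non-const global: type unknown inside functions
const x_const = rand(1000)  # const global: type known to the compiler

sum_plain() = sum(x_plain)
sum_const() = sum(x_const)

@btime sum_plain()   # slower: dynamic lookup on an untyped global
@btime sum_const()   # fast: compiles like a local variable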


Thank you, that worked!

I anticipate I will need to parallelize this process so that the optimizer distributes its search over the parameter space across multiple CPUs. Would this be doable for this algorithm?

Yeah, that should be doable. You'd need to:

  1. Make sure you do not allocate too much; the log.(prob(a, b_sol)) could perhaps be rewritten to avoid allocation. The GC is single-threaded, so if you allocate a lot, you are unlikely to see much speedup.
  2. Make sure there is one unique b per thread. You can create a vector of bs like bs = [copy(b0) for _ in 1:Threads.nthreads()] and then access the thread-specific b with bs[Threads.threadid()] (see the sketch after this list).
  3. Start Julia with -t 4, where 4 is the number of threads you want (or -t auto to use as many threads as there are cores).
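A minimal sketch of points 1 and 2, reusing the names from above (contraction_mapper, prob, and b0 are the same assumed functions and starting value as earlier):

const bs = [copy(b0) for _ in 1:Threads.nthreads()]   # one warm-start buffer per thread

function log_like_threaded(a)
   b = bs[Threads.threadid()]        # this thread's private b
   b_sol = contraction_mapper(a, b)
   lnl = -sum(log, prob(a, b_sol))   # sum(log, ...) avoids allocating the intermediate array (point 1)
   b .= b_sol                        # update only this thread's buffer
   return lnl
end

Note that on recent Julia versions tasks may migrate between threads, so indexing by Threads.threadid() is only safe with static scheduling, e.g. Threads.@threads :static.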

Is there a way to have Optim do the parallelization?

Conceptually, one approach would be to segment the parameter space for a into N segments and run N parallel searches, one over each segment. The maximum likelihoods from these N runs can be compared at the end, and the solution defined as the best of them.

I wonder if Optim has a built-in way to segment the parameter space and search each segment in parallel.

Nope, I don’t think so.

I did write some functionality to explore multiple starting points using Optim that was in fact multithreaded, but I never finished it and wouldn't really recommend using it.

There are other efforts out there that are better packaged, like this one.

The naive approach of just starting a number of searches in parallel, as you describe, shouldn't be too hard to pull off, though.
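A minimal sketch of that naive multistart approach (the random starting points are placeholders, and log_like_threaded is the thread-safe variant sketched earlier):

using Optim

starts = [randn(length(a0)) for _ in 1:8]     # hypothetical: 8 random starting points
results = Vector{Any}(undef, length(starts))

Threads.@threads for i in eachindex(starts)
   results[i] = optimize(log_like_threaded, starts[i])
end

best = argmin(Optim.minimum.(results))        # the run with the lowest objective
a_hat = Optim.minimizer(results[best])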