Avoid repeated memory allocation by defining an Array{Float64} outside of optimize() using Optim.jl, ForwardDiff.jl

I am using Optim.jl with an objective that is computed by multiplying, transposing and inverting some matrices (maximum size about 500x500). However, I have only a few parameters (at most 200), each with a fixed position in those matrices.

Until now I have used functions that construct the matrices every time the objective function is called, but this causes a big performance hit because memory is allocated on every call. I would therefore like to define the matrices once, in a struct outside the objective function, and only update the entries that hold the parameter values each time the objective is called.

This leads to the problem that the matrices are of type Array{Float64}, while the parameters are dual numbers.

This MWE reproduces the error:

using Optim, ForwardDiff

mutable struct MyStruct{T}
    a::T
end

A = [0.0 0
    0 0]

teststruct = MyStruct(A)

function prepare_struct(par, mystruct)
    mystruct.a[2,2] = par[1]
    return mystruct
end

function objective(mystruct)
    mystruct.a[2,2]^2
end

function wrap(par, mystruct)
    mystruct = prepare_struct(par, mystruct)
    return objective(mystruct)
end

optimize(par -> wrap(par, teststruct),
            [5.0],
            LBFGS(),
            autodiff = :forward)

ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#11#12",Float64},Float64,1})
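As far as I understand, the failure happens in prepare_struct: mystruct.a[2,2] = par[1] makes setindex! try to store a Dual in a Float64 slot, and no such conversion exists. A minimal sketch of just that failing step:

```julia
using ForwardDiff

A = zeros(2, 2)                 # eltype Float64
d = ForwardDiff.Dual(5.0, 1.0)  # a dual number carrying one partial

# Assigning a Dual into a Float64 array calls convert(Float64, d),
# for which no method exists -> the MethodError above.
err = try
    A[2, 2] = d
    nothing
catch e
    e
end
@assert err isa MethodError
```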

One possible remedy would be converting the matrices before changing the parameters:

function myconvert(par, mystruct)
    T = eltype(par)
    new_array = convert(Array{T}, mystruct.a)
    mystruct_conv = MyStruct(new_array)
    return mystruct_conv
end

function new_wrap(par, mystruct)
    mystruct = myconvert(par, mystruct)
    mystruct = prepare_struct(par, mystruct)
    return objective(mystruct)
end

optimize(par -> new_wrap(par, teststruct),
            [5.0],
            LBFGS(),
            autodiff = :forward)

Is converting a good/the most performant solution in this case?

For a fix see https://docs.sciml.ai/latest/basics/faq/#I-get-Dual-number-errors-when-I-solve-my-ODE-with-Rosenbrock-or-SDIRK-methods-1

Note that the wrapper struct doesn’t have to be mutable as long as the wrapped array itself is mutable. In this case you can just make it an ordinary struct.
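For example (a minimal sketch; the struct name is made up):

```julia
# Ordinary (immutable) struct: the field binding cannot be reassigned,
# but the array it refers to can still be mutated in place.
struct ParamMatrices{T}
    a::T
end

s = ParamMatrices(zeros(2, 2))
s.a[2, 2] = 3.0           # fine: mutates the wrapped array in place
# s.a = zeros(2, 2)       # would error: cannot rebind a field of an immutable struct
@assert s.a[2, 2] == 3.0
```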

On the other hand, why do you need to wrap the array in anything at all?

Thanks for your answer; this seems to be what I was looking for, but I don’t quite understand how it works or how to use it. I also could not find any further documentation on it…

Could you maybe post a code snippet showing how to use it in my example? Or is there somewhere I can read more about it? I would post my erroneous attempts to use it, but I don’t think that would help anyone…

You are completely right; I just was not thinking about mutable vs. immutable when writing the example. I don’t think it really matters for the question.

Also, as you say, there is no computational reason to store the array in a struct. However, in my real code I have several arrays, and I think most people using my code would see them as belonging together, since they define a statistical model. So it is just a matter of convenience, but maybe it makes the example a bit confusing…

Okay, so let me try again. In the example @ChrisRackauckas posted, the pre-allocated array was the same size as the parameter array. My pre-allocated array, however, is much bigger, and only some of its entries should be set to the parameters. I don’t know how to achieve this, since for example

A =  [0 0
      0 3.0]

function wrap(par, matr)
   matr = DiffEqBase.get_tmp(matr, par)
   print(size(matr))
   return par[1]^2
end

optimize(par -> wrap(par, DiffEqBase.dualcache(A)),
           [5.0],
           LBFGS(),
           autodiff = :forward)

leads to size(matr) = (5,2) and the values in A are not preserved (matr[2,2] should be 3.0).

The values shouldn’t be preserved. That’s the point of it being a cache. But you can fill/refill at any time.
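In the running example, that refill could look like the sketch below. It assumes the dualcache(u, Val{N}) method that DiffEqBase provided at the time (newer versions spell this PreallocationTools.DiffCache(u, N)); the chunk size Val{1} is chosen to match the single parameter, which also avoids the (5, 2) shape reported above.

```julia
using Optim, ForwardDiff, DiffEqBase

A = [0.0 0.0
     0.0 3.0]

# copy(A) so the Float64 (non-dual) evaluation path, which writes into the
# cache's plain buffer, never mutates A itself.
cache = DiffEqBase.dualcache(copy(A), Val{1})

function wrap(par, cache)
    matr = DiffEqBase.get_tmp(cache, par)  # scratch array, Dual-typed when par is
    matr .= A                              # refill the constant entries on every call
    matr[2, 2] = par[1]                    # then overwrite the parameter slots
    return matr[2, 2]^2
end

res = optimize(par -> wrap(par, cache), [5.0], LBFGS(), autodiff = :forward)
```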

Okay, so I would have to do something like

matr .= A

in the objective function? That would convert the Float64 entries of A to dual numbers. But then I would have to convert A every time the objective is called, which seems unnecessary, since converting it once outside the objective, before optimizing, should be sufficient. Outside, however, I don’t know what to convert it to: Array{ForwardDiff.Dual{Nothing, Float64, 0}} does not work, and I assume that is because of the tags.

And why is the size of matr so strange?
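For what it’s worth, the odd shape is consistent with a chunk-size mismatch: dualcache(A) defaults to chunk size ForwardDiff.pickchunksize(length(A)) == 4 for a 2x2 A, so the cache’s buffer holds Duals with 4 partials (5 Float64s each), while par carries Duals with a single partial (2 Float64s each). get_tmp reinterprets the buffer to the caller’s Dual type, and the 2 * 5 = 10 Float64s per column reinterpret to 5 of the smaller Duals, giving (5, 2). A sketch of just that reinterpret step (the Dual types here are illustrative):

```julia
using ForwardDiff

Big   = ForwardDiff.Dual{Nothing,Float64,4}  # assumed buffer eltype: 4 partials, 40 bytes
Small = ForwardDiff.Dual{Nothing,Float64,1}  # eltype of par's duals: 1 partial, 16 bytes

buffer = zeros(Big, 2, 2)
# Each column holds 2 * 40 = 80 bytes, i.e. 80 / 16 = 5 of the smaller Duals,
# so the reinterpreted view has size (5, 2).
matr = reinterpret(Small, buffer)
@assert size(matr) == (5, 2)
```

Passing the chunk size explicitly when building the cache, e.g. dualcache(A, Val{1}), should therefore give a (2, 2) scratch array.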