Avoid repeated memory allocation by defining an Array{Float64} outside of optimize() using Optim.jl, ForwardDiff.jl

Maximilian · April 10, 2020, 5:24pm

I am using Optim.jl with an objective that is computed by multipliying, transposing and inverting some matrices (maximum size of about 500x500). However, I have only a few parameters (maximum 200), each one having a fixed position in those matrices.

Until now, I used functions to define the matrices every time the objective function is called, but this leads to huge performance drawbacks as memory has to be allocated every time. Therefore, I would like to define the matrices in a struct outside of the objective function and only change the entries to the values of the parameters each time the objective is called.

This leads to the problem that the matrices are of Type Array{Float64}, while the parameters are Dual Numbers.

This mwe reproduces the Error:

using Optim, ForwardDiff

mutable struct MyStruct{T}
    a::T
end

A = [0.0 0
    0 0]

teststruct = MyStruct(A)

function prepare_struct(par, mystruct)
    mystruct.a[2,2] = par[1]
    return mystruct
end

function objective(mystruct)
    mystruct.a[2,2]^2
end

function wrap(par, mystruct)
    mystruct = prepare_struct(par, mystruct)
    return objective(mystruct)
end

optimize(par -> wrap(par, teststruct),
            [5.0],
            LBFGS(),
            autodiff = :forward)

ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#11#12",Float64},Float64,1})

One possible remedy would be converting the matrices before changing the parameters:

function myconvert(par, mystruct)
    T = eltype(par)
    new_array = convert(Array{T}, mystruct.a)
    mystruct_conv = MyStruct(new_array)
    return mystruct_conv
end

function new_wrap(par, mystruct)
    mystruct = myconvert(par, mystruct)
    mystruct = prepare_struct(par, mystruct)
    return objective(mystruct)
end

optimize(par -> new_wrap(par, teststruct),
            [5.0],
            LBFGS(),
            autodiff = :forward)

Is converting a good/the most performant solution in this case?

ChrisRackauckas · April 10, 2020, 5:54pm

For a fix see https://docs.sciml.ai/latest/basics/faq/#I-get-Dual-number-errors-when-I-solve-my-ODE-with-Rosenbrock-or-SDIRK-methods-1

DNF · April 11, 2020, 7:01pm

Note that the wrapper struct doesn’t have to be mutable as long as the wrapped array itself is mutable. In this case you can just make it an ordinary struct.

On the other hand, why do you need to wrap the array in anything at all?

Maximilian · April 11, 2020, 10:58pm

Thanks for your answer, this seems like what I was looking for, but I don’t quite understand how it works and how to use it. I also could not find more documentation about it…

Could you maybe post a code snippet about how to get this into my example? Or can I somewhere read more about it? I would post my erroneous attempts to use it but I think that wouldn’t help anyone…

Maximilian · April 11, 2020, 11:05pm

You are completely right, I just was not thinking about mutable vs. not mutable when writing the example, I think for the question it does not really matter.

Also as you say there is no computational reason to store the array in a struct. However, in my real code I have a few arrays and I think most people using my code would think of those arrays as belonging together as they define a statistical model. So it is just a matter of convenience. But maybe in the example it is a bit confusing…

Maximilian · April 24, 2020, 12:07pm

Okay, so let me try again. In the example @ChrisRackauckas posted, the pre-allocated array was as big as the array of parameters. However, my pre-allocated array is much bigger, and only some of the entries should be changed to the parameters. I don’t know how to achieve this, since for example

A =  [0 0
      0 3.0]

function wrap(par, matr)
   matr = DiffEqBase.get_tmp(matr, par)
   print(size(matr))
   return par[1]^2
end

optimize(par -> wrap(par, DiffEqBase.dualcache(A)),
           [5.0],
           LBFGS(),
           autodiff = :forward)

leads to size(matr) = (5,2) and the values in A are not preserved (matr[2,2] should be 3.0).

ChrisRackauckas · April 24, 2020, 2:32pm

The values shouldn’t be preserved. That’s the point of it being a cache. But you can fill/refill at any time.

Maximilian · April 24, 2020, 6:58pm

Okay, so I would have to do something like

matr .= A

in the objective function? This would convert the Float64 entries of A to dual numbers. But I would have to convert A every time the objective is called, which is quite unnecessary since converting it once outside the objective before optimizing the function would be sufficient. But outside, I don’t know to what I should convert it to, Array{ForwardDiff.Dual{Nothing, Float64, 0} does not work, I assume this is because of the tags.

And why is the size of matr so strange?

Topic		Replies	Views
Optim.optimize() uses excessive memory with autodiff? Optimization (Mathematical) optim	11	1686	June 22, 2018
Reduce memory allocation for repeatedly solving problem using Optim.jl Optimization (Mathematical) performance , optim , memory-allocation	0	585	December 23, 2020
Cache results between function calls in Optimization.jl Optimization (Mathematical)	11	684	May 1, 2023
Using preallocated arrays of unknown type Performance	6	168	June 16, 2025
How to cache dual numbers correctly /dual_cache reinterpretation error General Usage question , diffeq	1	826	November 10, 2019

Avoid repeated memory allocation by defining an Array{Float64} outside of optimize() using Optim.jl, ForwardDiff.jl

Related topics