Best practices for NonlinearSolve definition with temporary arrays

Hi all, I am relatively new to Julia. Apologies as it’s a long one but I couldn’t find a solution.

I am using NonlinearSolve.jl to solve a somewhat “complicated” system of equations g(u, p) = 0. The reason the function is “complicated” is that in order to calculate g, the code looks like this:

function g(u, p, 
    tempArray1, ..., tempArrayN, 
    fixedArray1, ..., fixedArrayM)

        <tons of calculations>
        return gvalue

end

In this pseudocode:

  • tempArrayX are arrays that are passed into the function to reduce the number of allocations of the code itself. The arrays are “temporary” (for the lack of a better word) as they are modified in place. Some are vectors, some others are matrices, etc. They all have well specified sizes and types (mostly composed of vectors of Float64s).
  • fixedArrayX are arrays that are passed as auxiliary parameters necessary in the code to compute g. I called them fixed as they are not modified inside g or any other function, and they could be passed using this syntax or within a struct using Parameters.jl and unpacking.

There are a large number of temp and fixed arrays, think of more than 30 of each. They also have different names (as opposed to these very simple ones).

Let me provide a rather silly example:

function g(u, p, temp_values, fixed_displacement, fixed_power)

    fill!(temp_values, zero(eltype(temp_values)))
    temp_values .+= u.^ fixed_power 
    vals = temp_values - fixed_displacement .- p
    return vals

end

# PARAMETERS USED
n_u = 5
fixed_power = 2.0
temp_values = zeros(n_u)
fixed_displacement = [3.0, 4.0, 5.0, 6.0, 7.0]

My question is how am I supposed to call this function within the NonlinearSolve.solve()? If we call the function from our workspace we get that it is type stable:

@code_warntype g(randn(n_u), ones(n_u), temp_values, fixed_displacement, fixed_power)

MethodInstance for g(::Vector{Float64}, ::Vector{Float64}, ::Vector{Float64}, ::Vector{Float64}, ::Float64)
  from g(u, p, temp_values, fixed_displacement, fixed_power) @ Main Untitled-1.ipynb:1
Arguments
  #self#::Core.Const(g)
  u::Vector{Float64}
  p::Vector{Float64}
  temp_values::Vector{Float64}
  fixed_displacement::Vector{Float64}
  fixed_power::Float64
Locals
  vals::Vector{Float64}
Body::Vector{Float64}
1 ─ %1 = Main.:+::Core.Const(+)
│   %2 = Base.broadcasted(Main.:^, u, fixed_power)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(^), Tuple{Vector{Float64}, Float64}}
│   %3 = Base.broadcasted(%1, temp_values, %2)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(+), Tuple{Vector{Float64}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(^), Tuple{Vector{Float64}, Float64}}}}
│        Base.materialize!(temp_values, %3)
│   %5 = Main.:-::Core.Const(-)
│   %6 = (temp_values - fixed_displacement)::Vector{Float64}
│   %7 = Base.broadcasted(%5, %6, p)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(-), Tuple{Vector{Float64}, Vector{Float64}}}
│        (vals = Base.materialize(%7))
└──      return vals

However, to get it into the nonlinear solver, one needs to “shorten” the function. For example, I can use an anonymous function. Something like:

g_short = (u, p) -> g(u, p, temp_values, fixed_displacement, fixed_power)

However, when checking the code, g_short seems to be type unstable:

@code_warntype(g_short(randn(n_u), ones(n_u)))

MethodInstance for (::var"#13#14")(::Vector{Float64}, ::Vector{Float64})
  from (::var"#13#14")(u, p) @ Main Untitled-1.ipynb:8
Arguments
  #self#::Core.Const(var"#13#14"())
  u::Vector{Float64}
  p::Vector{Float64}
Body::Any
1 ─ %1 = Main.g(u, p, Main.temp_values, Main.fixed_displacement, Main.fixed_power)::Any
└──      return %1

If you evaluate g_short within the solver it works fine but my guess is that it is likely using the values from the main workspace, which probably takes longer and creates type instability. I wonder if there is some other way to define these functions (g_short) that is not obvious to me. Here is the solution to the solve() problem for completeness:

u = @MArray[0.1, 0.1, 0.1, 0.1, 0.1]

problem = NonlinearProblem(g_short, u)
sol = solve(problem, FastShortcutNonlinearPolyalg(; autodiff = AutoFiniteDiff()), p = 3.0; maxiters = 50, abstol = 1e-14, show_trace = Val(true))

Bonus question. The end goal is to minimize another function f by choosing p and use g inside of f. In particular, f is such that one calculates it in a way that for each value of p one needs to choose u^* = u(p) such that g(u^*, p) = 0. This procedure then requires nesting the NonlinearSolve.solve() within an Optim.optim(). The problem I describe here then repeats with Optim. How can I do this while keeping the function type stable?

One follow up. I have figured out one could do:

g_short(temp_values, fixed_displacement, fixed_power) = (u, p) -> g(u, p, temp_values, fixed_displacement, fixed_power)

and then to evaluate at (u0, p0) call the function as g_short(temp_values, fixed_displacement, fixed_power)(u0, p0). But if I have way too many temporary arrays (as defined above), is there any prettier way to call the function?

I would say that the best way to to get a “short” version of g would be to put all your temporary and fixed arrays into p. The easiest way to do this is probably by way of a NamedTuple, possibly a nested one. Thus, for example:

temp_arrays = (; temp_1 = temp_1, temp_2 = temp_2) # and so on
fixed_arrays = (; fixed_1 = fixed_1, fixed_2 = fixed_2) # and so on
param_values = (;a = 3.0, b = 7.5) # and so on
p = (; temp_arrays, fixed_arrays, param_values)

and then in your function

function g(u, p)
(; temp_arrays, fixed_arrays, param_values) = p
(;a, b) = param_values
(;temp_1, temp_2) = temp_arrays
(;fixed_1, fixed_2) = fixed_arrays
# [...]
return vals
end