# How to avoid re-computing the same matrix in optimisation/JuMP?

I have a user-defined nonlinear constraint and its analytical gradient. A major effort is to compute the same matrix in both functions. In the problem below, it’s computed twice; in other problems, up to 2p times for a p x p matrix. Could this matrix be re-used instead?

Integer optimisation is required. It’s currently implemented with Pavito, using a workaround in https://github.com/JuliaOpt/Pavito.jl/pull/12; see also Integer optimisation with user's multivariate functions: Possible in Julia?

``````using JuMP, Ipopt, Cbc, LinearAlgebra, Pavito # with the workaround

#mats: p x p x n array of symmetric matrices
function minimise(mats, target:: Float64)
n = size(mats, 3)

function logDet(r...)
M = sum([r[i] * mats[:, :, i] for i = 1:n]) # same M
v = max(det(M), 1e-8) # safeguard against singularity
return  - log(v) - target
end

M = sum([r[i] * mats[:, :, i] for i = 1:n]) # same M
if det(M) < 1e-8 # perturb a singular matrix
M[diagind(M)] = M[diagind(M)] + 1e-8
end
D = inv(M)
for  j = 1: n
g[j] = - tr(D * mats[:, :, j])
end
end

solver = PavitoSolver(
mip_solver_drives=false,
rel_gap=1e-5,
mip_solver=CbcSolver(),
cont_solver=IpoptSolver(),
)
m = Model(solver = solver)
@variable(m,  x[1:n] >= 0, Int, start = 1)

JuMP.register(m, :logDet, n , logDet, logDet_grad)
JuMP.addNLconstraint(m,  :(\$(Expr(:call, :logDet,  x...) ) <=0.0 ) )
@objective(m, Min,  sum(x))
solve(m)
return  getvalue(x)
end

# Example data with 10 matrices:
arr = [ [0.00104058 -0.000268537; -0.000268537 6.93e-5]
[0.00390625 -0.000976563; -0.000976563 0.000244141]
[0.00826446 -0.00200351; -0.00200351 0.000485698]
[0.0357925 -0.00773893; -0.00773893 0.00167328]
[0.0625 -0.0125; -0.0125 0.0025]
[0.16 -0.0256; -0.0256 0.004096]
[0.25 -0.0333333; -0.0333333 0.00444444]
[0.390625 -0.0390625; -0.0390625 0.00390625]
[0.591716 -0.0364133; -0.0364133 0.00224082]
[0.756144 -0.0263006; -0.0263006 0.000914805] ]

target = log(17.57215962979037) # tight, active at solution
# target = log(17.6)  # lax

sol = minimise(reshape(arr', 2,2,10), target)
println(sol)

#Solution: obj = 11;  x = [0, 0, 0, 0, 0, 6, 0, 0, 0, 5] with either target
``````

This question is similar to JuMP User-Defined Function, Value and Gradient At Once. However, in that post, the function’s scalar value is re-used, rather than an intermediate matrix.

Extras: I would also welcome general advice on speeding-up the above code, like the suggestion by odow. Note that the actual p x p x n array usually has p x p matrices with a small p, in `[2, ..., 12]` , whereas their total number, n, can be up to 1,000 - 10,000. Also, the matrices are symmetric (and positive-semidefinite), so that n(n + 1)/2 elements need be added/multiplied rather than n^2.

Many thanks!

1 Like

Have you confirmed that computing `M` is a bottleneck? Computing `det(M)` and `inv(M)` looks expensive.

One thing you can do is make `M` 5x cheaper to compute by using explicit loops:

``````using BenchmarkTools

mats = rand(4, 4, 3)
n = 3
r = rand(3)

function foo(mats, n, r)
return sum([r[i] * mats[:, :, i] for i = 1:n])
end

function bar(mats, n, r)
M = zeros(size(mats, 1), size(mats, 2))
@inbounds for k in 1:n
@inbounds for j in 1:size(mats, 2)
@inbounds for i in 1:size(mats, 1)
M[i, j] += r[k] * mats[i, j, k]
end
end
end
return M
end

``````

Then, benchmarking

``````julia> @benchmark foo(\$mats, \$n, \$r)
BenchmarkTools.Trial:
memory estimate:  1.80 KiB
allocs estimate:  11
--------------
minimum time:     651.712 ns (0.00% GC)
median time:      733.819 ns (0.00% GC)
mean time:        918.549 ns (10.80% GC)
maximum time:     296.685 μs (99.51% GC)
--------------
samples:          10000
evals/sample:     160

julia> @benchmark bar(\$mats, \$n, \$r)
BenchmarkTools.Trial:
memory estimate:  208 bytes
allocs estimate:  1
--------------
minimum time:     128.784 ns (0.00% GC)
median time:      139.478 ns (0.00% GC)
mean time:        161.501 ns (6.74% GC)
maximum time:     53.877 μs (99.61% GC)
--------------
samples:          10000
evals/sample:     883
``````

It that still doesn’t work, then follow the advice in the post you linked to and cache the values of `logDet` and `logDet_grad` for incoming `r`; no need to cache the matrix `M`.

2 Likes

This is a very good speed-up strategy, thank you! I should’ve mentioned that the p x p x n array has p in `[2, ..., 12]`, whereas n can be 1,000 - 10,000; hence my assumptions about the bottlenecks.

However, I do not quite see how caching the values of `logDet` and `logDet_grad` would help much. The latter does compute `det(M)`, which can re-use `- log(v) - target`, but wouldn’t the other computations still be required? Also, wouldn’t it be expensive to check, say, a `1000 x 1` vector for equality to a previous `r` (unless some IDs or hashcodes are compared)?

Therefore, would caching `M` between the calls to `logDet` and `logDet_grad` be more beneficial? Then, how would I go about it? Are there some packages or code examples? Can I do it in JuMP or only in `MathProgBase` directly?

Probably the simplest solution is to define a third function, such as `bar(mats, n, r)` in the answer by odow. This function, returning the summed matrix, would be memoized and called from inside `logDet(r...)` and `logDet_grad(g, r...)`. Oops!