How to avoid re-computing the same matrix in optimisation/JuMP?

Olegg · November 22, 2018, 8:13pm

I have a user-defined nonlinear constraint and its analytical gradient. A major effort is to compute the same matrix in both functions. In the problem below, it’s computed twice; in other problems, up to 2p times for a p x p matrix. Could this matrix be re-used instead?

Integer optimisation is required. It’s currently implemented with Pavito, using a workaround in Disable Hessians when not provided by JuMP by spockoyno · Pull Request #12 · jump-dev/Pavito.jl · GitHub; see also Integer optimisation with user's multivariate functions: Possible in Julia?

using JuMP, Ipopt, Cbc, LinearAlgebra, Pavito # with the workaround 

#mats: p x p x n array of symmetric matrices
function minimise(mats, target:: Float64)
n = size(mats, 3)

function logDet(r...)
        M = sum([r[i] * mats[:, :, i] for i = 1:n]) # same M
        v = max(det(M), 1e-8) # safeguard against singularity
        return  - log(v) - target
end

function logDet_grad(g,  r...)
    M = sum([r[i] * mats[:, :, i] for i = 1:n]) # same M
        if det(M) < 1e-8 # perturb a singular matrix
            M[diagind(M)] = M[diagind(M)] + 1e-8
         end
    D = inv(M)
        for  j = 1: n
            g[j] = - tr(D * mats[:, :, j])
        end
end

solver = PavitoSolver(
    mip_solver_drives=false,
    rel_gap=1e-5,
	mip_solver=CbcSolver(),
	cont_solver=IpoptSolver(),
)
m = Model(solver = solver)
@variable(m,  x[1:n] >= 0, Int, start = 1)

JuMP.register(m, :logDet, n , logDet, logDet_grad)
JuMP.addNLconstraint(m,  :($(Expr(:call, :logDet,  x...) ) <=0.0 ) )
@objective(m, Min,  sum(x))
solve(m)
return  getvalue(x)
end

# Example data with 10 matrices:
arr = [ [0.00104058 -0.000268537; -0.000268537 6.93e-5]
[0.00390625 -0.000976563; -0.000976563 0.000244141]
[0.00826446 -0.00200351; -0.00200351 0.000485698]
[0.0357925 -0.00773893; -0.00773893 0.00167328]
[0.0625 -0.0125; -0.0125 0.0025]
[0.16 -0.0256; -0.0256 0.004096]
[0.25 -0.0333333; -0.0333333 0.00444444]
[0.390625 -0.0390625; -0.0390625 0.00390625]
[0.591716 -0.0364133; -0.0364133 0.00224082]
[0.756144 -0.0263006; -0.0263006 0.000914805] ]

target = log(17.57215962979037) # tight, active at solution
# target = log(17.6)  # lax 

sol = minimise(reshape(arr', 2,2,10), target)
println(sol)

#Solution: obj = 11;  x = [0, 0, 0, 0, 0, 6, 0, 0, 0, 5] with either target

This question is similar to JuMP User-Defined Function, Value and Gradient At Once. However, in that post, the function’s scalar value is re-used, rather than an intermediate matrix.

Extras: I would also welcome general advice on speeding-up the above code, like the suggestion by odow. Note that the actual p x p x n array usually has p x p matrices with a small p, in [2, ..., 12] , whereas their total number, n, can be up to 1,000 - 10,000. Also, the matrices are symmetric (and positive-semidefinite), so that n(n + 1)/2 elements need be added/multiplied rather than n^2.

Many thanks!

odow · November 28, 2018, 2:19am

Have you confirmed that computing M is a bottleneck? Computing det(M) and inv(M) looks expensive.

One thing you can do is make M 5x cheaper to compute by using explicit loops:

using BenchmarkTools

mats = rand(4, 4, 3)
n = 3
r = rand(3)

function foo(mats, n, r)
    return sum([r[i] * mats[:, :, i] for i = 1:n])
end

function bar(mats, n, r)
    M = zeros(size(mats, 1), size(mats, 2))
    @inbounds for k in 1:n
        @inbounds for j in 1:size(mats, 2)
            @inbounds for i in 1:size(mats, 1)
                M[i, j] += r[k] * mats[i, j, k]
            end
        end
    end
    return M
end

Then, benchmarking

julia> @benchmark foo($mats, $n, $r)
BenchmarkTools.Trial:
  memory estimate:  1.80 KiB
  allocs estimate:  11
  --------------
  minimum time:     651.712 ns (0.00% GC)
  median time:      733.819 ns (0.00% GC)
  mean time:        918.549 ns (10.80% GC)
  maximum time:     296.685 μs (99.51% GC)
  --------------
  samples:          10000
  evals/sample:     160

julia> @benchmark bar($mats, $n, $r)
BenchmarkTools.Trial:
  memory estimate:  208 bytes
  allocs estimate:  1
  --------------
  minimum time:     128.784 ns (0.00% GC)
  median time:      139.478 ns (0.00% GC)
  mean time:        161.501 ns (6.74% GC)
  maximum time:     53.877 μs (99.61% GC)
  --------------
  samples:          10000
  evals/sample:     883

It that still doesn’t work, then follow the advice in the post you linked to and cache the values of logDet and logDet_grad for incoming r; no need to cache the matrix M.

Olegg · November 28, 2018, 11:50am

This is a very good speed-up strategy, thank you! I should’ve mentioned that the p x p x n array has p in [2, ..., 12], whereas n can be 1,000 - 10,000; hence my assumptions about the bottlenecks.

However, I do not quite see how caching the values of logDet and logDet_grad would help much. The latter does compute det(M), which can re-use - log(v) - target, but wouldn’t the other computations still be required? Also, wouldn’t it be expensive to check, say, a 1000 x 1 vector for equality to a previous r (unless some IDs or hashcodes are compared)?

Therefore, would caching M between the calls to logDet and logDet_grad be more beneficial? Then, how would I go about it? Are there some packages or code examples? Can I do it in JuMP or only in MathProgBase directly?

Olegg · November 29, 2018, 5:34pm

Probably the simplest solution is to define a third function, such as bar(mats, n, r) in the answer by odow. This function, returning the summed matrix, would be memoized and called from inside logDet(r...) and logDet_grad(g, r...). Oops!