Solving large convex optimization/flow assignment problem

I am trying to solve a convex flow-assignment problem on a large(ish) graph with (currently) about 7000 edges. Mathematically, it looks like

\underset{x\in\mathbb{R}_+^{|E|}}{\text{minimize}} \quad U(x) \\ U(x) = -l^T(u \,\circ\,x) + l^T F.(x) \\ \text{such that} \quad Ax=b

where x is a vector containing the flows to optimize. l and u contain constant factors detailing the length of each edge (positive) and utility rate (utility per distance, strictly negative). A is the incidence matrix of my graph, and b is a vector which is -1 at the index of the start node, 1 at the index of the destination node and zero otherwise.

F is given by

F(x) = (1+x)\cdot\ln(1+x)-x

(see this paper for the full details)

As a start, I tried to implement the problem using Optimizations.jl, which looks like this:

using Optimization, OptimizationOptimJL
using SparseArrays
using ReverseDiff
using Graphs

# objective and constraint
f(x) = ((1 + x) * log(1 + x)) - x
U(x, p) = -p[1]' * (p[2] .* x) + p[1]' * f.(x)
flow_cons(res, x, p) = res .= p[3] * x

# create specific problem
function setup_problem(l, u, A, src, dst)
    num_verts = size(A, 1)
    num_edges = size(A, 2)
    x0 = fill(0.1, num_edges)

    b = spzeros(num_verts)
    b[src] = -1.0
    b[dst] = 1.0

    optprob = OptimizationFunction(
        U,
        AutoReverseDiff(compile=false),
        cons=flow_cons
    )
    prob = OptimizationProblem(
        optprob, x0, [l, u, A],
        lcons=b, ucons=b,
        lb=zeros(num_edges), ub=fill(Inf, num_edges)
    )
end

If I then try to solve it with the simple toy graph given by

function toy_flow_graph()
    l = [2.0, 1, 1, 1, 1, 2]
    u = [-1.0, -1, -1, -1, -1, -2]
    A = spzeros(3, 6)
    for (i, e) in enumerate([(1, 3), (1, 2), (2, 3), (2, 3), (2, 1), (1, 3)])
        A[e[1], i] = -1
        A[e[2], i] = 1
    end
    return l, u, A
end

toy_problem = setup_problem(toy_graph()..., 1, 3)
toy_solution = solve(toy_problem, IPNewton())

I get the expected results as given in the paper, which is great, but when I try to solve the same problem on even slightly larger (not even close to the size of graph I want to get to) graphs, this approach slows down considerably:

function grid_flow_graph(dims)
    g = SimpleDiGraph(Graphs.grid(dims))
    l = rand(ne(g))
    u = fill(-1.0, ne(g))
    A = incidence_matrix(g)
    return l, u, A
end

grid_problem = setup_problem(grid_flow_graph([6, 6])..., 1, 2)
grid_solution = solve(grid_problem, IPNewton())  # takes about 5 seconds

Which takes about 5 to 6 seconds on my computer (mac book air m1).

From what I can see profiling the solve call, I would guess that most of the time is spent setting up the jacobian and the hessian using ReverseDiff. Is the objective function U setup in a weird way such that this operation scales badly? Or is ReverseDiff not able to handle a function which takes a vector of 120 inputs? (I would assume that there are much larger problems run through it than this one…)

Is Optimization.jl just not the right choice for this kind of problem? Should I look at JuMP or something completely different? Since the problem is convex, maybe Convex.jl might work here as well.

I am somewhat stuck at figuring out how I can get this to be fast enough so that I can actually run my real graphs through it. Any pointers on what I am doing wrong here are greatly appreciated.

Without really looking carefully, did you try another solver? One that does not use the exact hessian?

Providing exact derivatives also seems possible without changing the library.

For this type of problem, the linear version would be super fast (basically a flow algorithm), so perhaps Frank-Wolfe might make sense.

Hi @SuperGrobi, here’s the JuMP code:

julia> using Graphs

julia> using JuMP

julia> using Ipopt

julia> 

julia> function solve_jump(l, u, A, src, dst)
           num_verts, num_edges = size(A)
           b = zeros(num_verts)
           b[src] = -1.0
           b[dst] = 1.0
           F(x) = (1 + x) * log(1 + x) - x
           model = Model(Ipopt.Optimizer)
           set_silent(model)
           @variable(model, x[1:num_edges] >= 0, start = 0.1)
           @constraint(model, A * x .== b)
           @objective(model, Min, -l' * (u .* x) + l' * F.(x))
           optimize!(model)
           @assert termination_status(model) == LOCALLY_SOLVED
           return value.(x)
       end
solve_jump (generic function with 1 method)

julia> 

julia> function grid_flow_graph(dims)
           g = SimpleDiGraph(Graphs.grid(dims))
           l = rand(ne(g))
           u = fill(-1.0, ne(g))
           A = incidence_matrix(g)
           return l, u, A
       end
grid_flow_graph (generic function with 1 method)

julia> l, u, A = grid_flow_graph([6, 6])
([0.8130872809250147, 0.4671681719332792, 0.9050872185180605, 0.37447369811689646, 0.4339868318121647, 0.21060027380170632, 0.9360661321924352, 0.9300006732959478, 0.10371917532510122, 0.4935427019392431  …  0.15502813567371732, 0.8135556955554263, 0.90531598624137, 0.5999635071375383, 0.5709561809545151, 0.42871423873511194, 0.699746810083748, 0.018612322759939404, 0.1803722771200481, 0.3034136629962212], [-1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0  …  -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0], sparse([1, 2, 1, 7, 1, 2, 2, 3, 2, 8  …  29, 35, 34, 35, 35, 36, 30, 36, 35, 36], [1, 1, 2, 2, 3, 3, 4, 4, 5, 5  …  116, 116, 117, 117, 118, 118, 119, 119, 120, 120], [-1, 1, -1, 1, 1, -1, -1, 1, -1, 1  …  1, -1, 1, -1, -1, 1, 1, -1, 1, -1], 36, 120))

julia> src, dst = 1, 2
(1, 2)

julia> @time solve_jump(l, u, A, src, dst);

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

  1.516486 seconds (2.42 M allocations: 164.183 MiB, 3.15% gc time, 98.68% compilation time: 10% of which was recompilation)

julia> @time solve_jump(l, u, A, src, dst);
  0.016735 seconds (31.25 k allocations: 1.838 MiB)

It easily solves the larger cases:

julia> l, u, A = grid_flow_graph([100, 100]);

julia> src, dst = 1, 2
(1, 2)

julia> @time solve_jump(l, u, A, src, dst);
  2.783527 seconds (9.51 M allocations: 551.793 MiB, 15.77% gc time)

julia> size(A)
(10000, 39600)
5 Likes

I went through the documentation of Optimization.jl. The amount of solvers which support both the cons and box constraints are unfortunately very few or not documented. Only the COBYLA optimizer from PRIMA.jl worked out of the box, but took about 30 seconds on the [6,6] grid graph.

1 Like

Writing up the derivatives would have been the next thing I try. Just wanted to be sure that there is no real julia magic I am trying to use wrongly here… I will also give the Frank-Wolfe a look.

Right now, Optimization.jl does not specialize on convex optimization problems so JuMP is a better bet.

Alright! Thank you very much for the insights. I will give JuMP a go then.

2 Likes