I am trying to solve a convex flow-assignment problem on a large-ish graph with (currently) about 7000 edges. Mathematically, it looks like

$$\min_{x \ge 0} \; -l^\top (u \odot x) + l^\top F(x) \quad \text{subject to} \quad A x = b,$$

where x is a vector containing the flows to optimize and F is applied elementwise. l and u contain constant factors giving the length of each edge (positive) and the utility rate (utility per distance, strictly negative). A is the incidence matrix of my graph, and b is a vector which is -1 at the index of the start node, 1 at the index of the destination node, and zero otherwise.
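For example, with three vertices, src = 1 and dst = 3 (as in the toy graph below), this gives b = [-1, 0, 1].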

F is given by

$$F(x) = (1 + x)\log(1 + x) - x$$

(see this paper for the full details)

As a start, I tried to implement the problem using `Optimization.jl`, which looks like this:

```
using Optimization, OptimizationOptimJL
using SparseArrays
using ReverseDiff
using Graphs

# objective and constraint; the parameter vector is p = [l, u, A]
f(x) = ((1 + x) * log(1 + x)) - x
U(x, p) = -p[1]' * (p[2] .* x) + p[1]' * f.(x)
flow_cons(res, x, p) = res .= p[3] * x

# create specific problem
function setup_problem(l, u, A, src, dst)
    num_verts = size(A, 1)
    num_edges = size(A, 2)
    x0 = fill(0.1, num_edges)
    b = spzeros(num_verts)
    b[src] = -1.0
    b[dst] = 1.0
    optprob = OptimizationFunction(
        U,
        AutoReverseDiff(compile=false),
        cons=flow_cons
    )
    prob = OptimizationProblem(
        optprob, x0, [l, u, A],
        lcons=b, ucons=b,          # equality constraint A*x == b
        lb=zeros(num_edges), ub=fill(Inf, num_edges)
    )
end
```

If I then try to solve it with the simple toy graph given by

```
function toy_flow_graph()
    l = [2.0, 1, 1, 1, 1, 2]
    u = [-1.0, -1, -1, -1, -1, -2]
    A = spzeros(3, 6)
    for (i, e) in enumerate([(1, 3), (1, 2), (2, 3), (2, 3), (2, 1), (1, 3)])
        A[e[1], i] = -1  # edge i leaves vertex e[1]
        A[e[2], i] = 1   # edge i enters vertex e[2]
    end
    return l, u, A
end
toy_problem = setup_problem(toy_flow_graph()..., 1, 3)
toy_solution = solve(toy_problem, IPNewton())
```

I get the expected results as given in the paper, which is great. But when I try to solve the same problem on even slightly larger graphs (still nowhere near the size I eventually want to reach), this approach slows down considerably:

```
function grid_flow_graph(dims)
    g = SimpleDiGraph(Graphs.grid(dims))
    l = rand(ne(g))
    u = fill(-1.0, ne(g))
    A = incidence_matrix(g)
    return l, u, A
end
grid_problem = setup_problem(grid_flow_graph([6, 6])..., 1, 2)
grid_solution = solve(grid_problem, IPNewton()) # takes about 5 seconds
```

This takes about 5 to 6 seconds on my computer (MacBook Air M1).

From what I can see when profiling the `solve` call, I would guess that most of the time is spent setting up the Jacobian and the Hessian using `ReverseDiff`. Is the objective function `U` set up in a weird way such that this operation scales badly? Or is `ReverseDiff` not able to handle a function which takes a vector of 120 inputs? (I would assume that much larger problems than this one are run through it…)
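To check that suspicion in isolation, I was planning to time the reverse-mode gradient and Hessian of `U` on their own, roughly like this (just a sketch reusing `U` and `grid_flow_graph` from above; I have not verified that this captures everything `solve` does):

```
using ReverseDiff, BenchmarkTools

l, u, A = grid_flow_graph([6, 6])
p = [l, u, A]
x = fill(0.1, length(l))

# one reverse-mode gradient / Hessian of the objective alone
@btime ReverseDiff.gradient(y -> U(y, $p), $x)
@btime ReverseDiff.hessian(y -> U(y, $p), $x)
```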

Is `Optimization.jl` just not the right choice for this kind of problem? Should I look at `JuMP` or something completely different? Since the problem is convex, maybe `Convex.jl` might work here as well.
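For reference, this is roughly what I imagine the JuMP formulation would look like (untested sketch; I picked `Ipopt` as the solver purely for illustration, and the direct nonlinear `@objective` needs a recent JuMP version with the new nonlinear interface):

```
using JuMP, Ipopt

function setup_jump_problem(l, u, A, src, dst)
    num_verts, num_edges = size(A)
    b = zeros(num_verts)
    b[src] = -1.0
    b[dst] = 1.0

    model = Model(Ipopt.Optimizer)
    @variable(model, x[1:num_edges] >= 0, start = 0.1)
    @constraint(model, A * x .== b)
    # same objective as U above: sum over edges of l_e * (F(x_e) - u_e * x_e)
    @objective(model, Min,
        sum(l[e] * (((1 + x[e]) * log(1 + x[e]) - x[e]) - u[e] * x[e]) for e in 1:num_edges))
    return model
end

jump_model = setup_jump_problem(grid_flow_graph([6, 6])..., 1, 2)
optimize!(jump_model)
```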

I am somewhat stuck figuring out how to make this fast enough to actually run my real graphs through it. Any pointers on what I am doing wrong here are greatly appreciated.