Is Julia Optimization fit for me?

Hello, I am new to Julia but not to optimization. I have implemented a large direct collocation constrained optimization NLP problem in MATLAB, it works but is a bit slow for my taste (takes about 5 minutes on average). I am trying to implement the same NLP problem in Julia using Optimization + Optim + ForwardDiff packages (IPNewton is the solver I use, since this is a constrained optimization problem, plus my constraints are non-symbolic, user-defined and nonlinear). My question here is, is Julia right for me? I seemed to have a successful implementation syntactically and semantically in Julia, but the optimizer runs forever. To let you know of the size of the problem, I have 1800 x 1 input vector X, and 1686 x 1 nonlinear equality plus inequality constraint vector. The program often runs out of memory trying to compute the hessian tensor or if doesn’t, it runs forever. Here is a snippet of optimization code:

function optimizer(X0, params, lb, ub)
    objective = OptimizationFunction(cost, Optimization.AutoForwardDiff(1), cons = constraints)
    # objective = OptimizationFunction(cost, Optimization.AutoFiniteDiff(), cons = constraints)
    collocation_params = create_common_parameters_for_collocation()
    merge!(params, collocation_params)
    lcons, ucons = get_constraint_bounds(params)
    problem = OptimizationProblem(objective, X0, params, lb = lb, ub = ub, lcons = lcons, ucons = ucons, sense = Optimization.MinSense)

    @time sol = solve(problem, IPNewton(), maxiters = 10, show_trace = true, extended_trace = true, show_every = 1, maxtime = 60)

    return sol

Note that solver is either not finishing even one iteration or it’s just ignoring the options I specified to provide an output trace.

I greatly appreciate any direction from seasoned Julia engineers on this. I really had high hopes for reducing the average runtime for my NLP problem in Julia, but this is a completely counter-intuitive result that I am seeing.


If you are dealing with constrained nonlinear optimization, JuMP might be a good tool for you. It allows you to model the optimization problem and pass it to an optimization solver like Ipopt.

You can check out some examples here: Simple examples · JuMP

Does JuMP allow user defined constraints and cost functions? Last I checked, JuMP does not handle that. What I mean by user defined constraints is that the constraints are not defined by analytical equations, as shown in the JuMP examples.


As far as I know, you can use user-defined functions: Nonlinear Modeling · JuMP

I don’t have much experience with these myself, so someone else might have better insights here.

1 Like

You can always just add println statements in your cost function to check this. In general, you want to make sure your cost/constraint functions and their gradients are correct and efficient before worrying about the optimization algorithm.

One thing that jumps out at me is that if you have 1800 inputs, you should not be using forward-mode AD (ForwardDiff) — this will be about 2000x slower to get the gradient than evaluating your cost function! When you have lots of inputs and only one or a few scalar outputs, you should always use reverse-mode AD (or manual adjoint methods), which allow you to compute the gradient with about the same time as computing the cost function once. Even if you are using AD, it’s really crucial to have some idea of how differentiation works in large-scale problems; see also my course notes on reverse/adjoint/backpropagation derivatives as well as many other sources online on this stuff.

There are several packages to choose from that provide nonlinear optimization with arbitrary cost functions and constraints in Julia, including NLopt.jl and Nonconvex.jl.


Beside the above great comments, also be mindful of the order of the optimization algorithm you are using. Second order algorithms like IPNewton require the Hessian of the Lagrangian function which can take a long time if you have many variables and constraints. Consider using the first order augmented Lagrangian algorithm from Percival.jl (available through Nonconvex.jl as well) which only uses the gradient and approximates the Hessian using the l-BFGS method. IPOPT is another great choice which has a first order mode and is also available in Nonconvex.


Thanks everyone for your responses. I have now implemented a rudimentary trajectory optimization algorithm using Nonconvex.jl with IPOPT backend. I realized that the constraints in my problem are independent of each other, in other words, they can be computed PARALLELLY, if Nonconvex.jl allows it. Does anyone know if Nonconvex.jl or IPOPT itself supports this feature? The reason I am asking this is the algorithm currently takes upwards of 10 minutes to run and I am trying to cut down the run time to less than 2 minutes.

1 Like