I don’t think projection will work for MMA without generalising the algorithm in some way. It may be possible, but it’s not trivial.
Can we use this package to optimize over parameters which are organized in structs (instead of having all parameters arranged linearly in a vector or an array)?
Given that this uses Zygote, I am thinking it might not be so difficult to make it work with more general parameter structures, just like Flux.jl does (via the handy `params` construct).
What are your thoughts on this? From the README it seems at the moment Nonconvex.jl only handles having all parameters in a single array, am I correct?
For now it’s only vectors. I do plan to have conversion functions between vectors and other data structures. Supporting non-vector data structures natively may be possible for pure Julia solvers but not for the other ones. The conversion function approach is more general and can be used to support sparse matrices, complex numbers, etc.
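As a rough illustration of what such a conversion function could look like (the `Params` struct and the `flatten`/`unflatten` helpers below are made up for this example), one can flatten a parameter struct into a vector for the solver and rebuild it inside the objective:

```julia
# Hypothetical sketch: a parameter struct converted to/from the Vector a solver expects
struct Params
    w::Vector{Float64}
    b::Float64
end

flatten(p::Params) = vcat(p.w, p.b)
unflatten(v::AbstractVector) = Params(v[1:end-1], v[end])

# The user-facing loss works on the struct; the solver only ever sees a Vector
loss(p::Params) = sum(abs2, p.w) + p.b^2
objective(v::AbstractVector) = loss(unflatten(v))

v0 = flatten(Params([1.0, 2.0], 0.5))  # initial point passed to the solver
```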
Can you please provide an example of how one would use ForwardDiff instead of Zygote for a specific objective function with this package?
I added an example to the README https://github.com/mohamed82008/Nonconvex.jl#hack-to-use-other-automatic-differentiation-backends. @oxinabox may not approve though.
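For context, and with the caveat that the linked README section is the authoritative version, the hack amounts to defining an `rrule` that computes the gradient with ForwardDiff so that Zygote’s reverse pass picks it up (the objective `f` below is just a placeholder):

```julia
using ChainRulesCore, ForwardDiff

f(x) = sum(abs2, x)  # placeholder objective

# Make reverse-mode AD (Zygote) obtain f's gradient from ForwardDiff instead
function ChainRulesCore.rrule(::typeof(f), x::AbstractVector)
    val = f(x)
    grad = ForwardDiff.gradient(f, x)
    # NO_FIELDS was the ChainRulesCore spelling at the time; newer versions use NoTangent()
    return val, Δ -> (NO_FIELDS, Δ * grad)
end
```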
I approve of this, it looks like the right way to do it to me.
If you do it to a function you don’t own, it is type piracy.
But I mean, it’s probably not the wrong thing to do.
Since if you want it evaluated in forward mode for optimization purposes, you probably want it evaluated in forward mode in general, even when it’s used as part of a larger system that a reverse-mode AD is differentiating.
And now your AD is mixed-mode.
You might be interested in ParameterHandling.jl, whose main purpose is doing exactly this via `flatten`.
It’s @willtebbutt’s work.
Similar code also exists in FiniteDifferences.jl as `to_vec`, though we would like to remove it and replace it with ChainRules’ differential types, which can naturally perform addition with arbitrary data types without having to push everything down to a vector and back up.
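For reference, a minimal sketch of what those two utilities do (values here are illustrative):

```julia
using ParameterHandling, FiniteDifferences

params = (w = [1.0, 2.0], b = 0.5)  # nested/named parameters

# ParameterHandling.flatten returns a Vector plus a closure that rebuilds the structure
v, unflatten = ParameterHandling.flatten(params)
params_again = unflatten(v)

# FiniteDifferences.to_vec plays a similar role with a slightly different API
v2, from_vec = FiniteDifferences.to_vec(params)
params_again2 = from_vec(v2)
```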
But you probably can’t do that, since Ipopt probably demands a Vector, right?
> You might be interested in ParameterHandling.jl, whose main purpose is doing exactly this via `flatten`.
Awesome! I will look into it, thanks.
Correct.
Thanks. This is fine as a workaround, but since many packages implement some kind of AD shim that takes a real-valued function and provides a gradient (e.g. GalacticOptim, LogDensityProblems, and many more), it would be great to arrive at a single, lightweight package that does this for all AD frameworks.
That was the point of https://github.com/JuliaDiff/AbstractDifferentiation.jl. I need to finish that PR though.
Update - Release v0.5.1
Mixed integer optimization is now available in Nonconvex.jl using Juniper and Ipopt. To my knowledge, this is the first mixed integer nonlinear programming package in Julia that uses Zygote for automatic differentiation and doesn’t require an explicit JuMP model definition. Happy optimization!
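A rough sketch of what a mixed-integer problem can look like. This is written against the later (post-1.0) interface for readability, and the `integer` keyword and option names follow my reading of the documentation, so treat the exact spellings as assumptions:

```julia
using Nonconvex
Nonconvex.@load Juniper  # pulls in the Juniper + Ipopt wrapper

f(x) = sqrt(x[2])
g(x, a, b) = (a * x[1] + b)^3 - x[2]

m = Model(f)
# mark the second variable as integer-valued
addvar!(m, [0.0, 0.0], [10.0, 10.0], integer = [false, true])
add_ineq_constraint!(m, x -> g(x, 2, 0))
add_ineq_constraint!(m, x -> g(x, -1, 1))

result = optimize(m, JuniperIpoptAlg(), [1.0, 1.0], options = JuniperIpoptOptions())
result.minimum, result.minimizer
```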
How would I modify that for the Jacobian (of the constraint)?
```julia
function ChainRulesCore.rrule(::typeof(constraint), x::AbstractVector)
    val = constraint(x)
    jac = ForwardDiff.jacobian(constraint, x)
    val, Δ -> (NO_FIELDS, Δ * jac)
end
```
does not work (`DimensionMismatch`).
`Δ * jac` → `jac' * Δ`
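Putting that fix into the snippet above, the working `rrule` reads:

```julia
function ChainRulesCore.rrule(::typeof(constraint), x::AbstractVector)
    val = constraint(x)
    jac = ForwardDiff.jacobian(constraint, x)
    # the pullback maps the output cotangent Δ back to x-space, i.e. Jᵀ Δ
    val, Δ -> (NO_FIELDS, jac' * Δ)
end
```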
Update - Release v1.0
I released Nonconvex 1.0 and started the JuliaNonconvex organisation. In the 1.0 release, Nonconvex.jl is just a shell package that can be used to load individual solver/wrapper packages using the `Nonconvex.@load` macro (a short sketch of this workflow follows the feature list below). Individual solvers and wrappers live in their own packages in the JuliaNonconvex org, while the documentation lives in the Nonconvex.jl repo. There are plenty of features that I haven’t announced yet on this forum which are available in 1.0:
- Arbitrary decision variable types are now supported. This includes tuples, named tuples, arrays of arrays, dictionaries, structs, etc. These can also be nested together or put in a heterogeneous vector.
- A multi-start version of all the algorithms is now available, making use of Hyperopt.jl to restart the optimisation from different initial solutions using various strategies.
- A JuMP model can now be converted to a Nonconvex model. So you can start from a JuMP model, define all your variables and linear constraints there, and then convert that model to a Nonconvex model to define the nonlinear functions, make use of Zygote, and/or make use of the optimisation algorithms available in Nonconvex.jl.
- An experimental constrained surrogate-assisted optimisation meta-algorithm is now available. It can replace expensive functions with Gaussian process surrogates and solve the surrogate model using any of the other optimisation algorithms in Nonconvex.jl (hence the meta part). It’s experimental because it needs more fine-tuning and experimentation. Inequality and equality constraints are supported.
- The augmented Lagrangian algorithm in Nonconvex (wrapped from Percival.jl) now handles arbitrary precision number types correctly.
- Thanks to @noil-reed 's GSoC project, Nonconvex.jl now has nonconvex semidefinite constraint support via an interior point meta-algorithm that can turn any nonlinear programming solver into a nonlinear semidefinite programming solver by iteratively solving the barrier problem. This means that you can constrain any (non-convex) matrix-valued function of the decision variables to be positive semidefinite. This is recent work and the documentation is underway, but you can find an example here.
- @noil-reed also implemented some zeroth-order local search algorithms for bound-constrained problems that don’t require gradients. See the multi-trajectory search algorithm in the documentation for more details.
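As mentioned above, loading a solver through the shell package and solving a small problem looks roughly like this; the solver choice and exact option names are illustrative, so check the documentation for the authoritative version:

```julia
using Nonconvex
Nonconvex.@load Ipopt  # loads the Ipopt wrapper package and its exported types

f(x) = sum(abs2, x)

m = Model(f)
addvar!(m, [-1.0, -1.0], [1.0, 1.0])  # bounds for two decision variables
add_ineq_constraint!(m, x -> 0.5 - x[1] - x[2])  # enforce x[1] + x[2] ≥ 0.5

result = optimize(m, IpoptAlg(), [0.5, 0.5], options = IpoptOptions())
result.minimizer, result.minimum
```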
Happy optimisation and as always feel free to test the package, open issues, start discussions, and/or of course reach out to contribute!