OptimKit.jl v0.1.0 was just registered overnight. It is a package for gradient-based optimization, and currently supports gradient descent, conjugate gradient and LBFGS.
Why did I create another package for this, rather than contributing to alternatives that are already around? Similar to how KrylovKit.jl relates to IterativeSolvers.jl, I did not find the existing packages like Optim.jl (which is otherwise great) sufficiently flexible for my use cases. In particular, I dislike the assumption that parameters should be captured in some AbstractVector or AbstractArray subtype (especially for optimization problems, where this assumption is even mathematically too restrictive, as the parameters can be part of any kind of manifold). Julia allows for extremely generic programming, but unfortunately this is not always reflected in the corresponding packages, I assume because of traditions baked in from how these methods (optimization methods, iterative methods, …) are implemented and provided in conventional languages.
In particular, OptimKit.jl does not assume that your parameters are some subtype of AbstractArray, it does not assume that your gradients are some subtype of AbstractArray, and it does not assume that everything takes place in Euclidean space. It implements generalized versions of gradient descent, conjugate gradient and LBFGS, as they are being investigated and formulated in the context of Riemannian optimization.
Are your objective function and its gradient computed by a single function fg(x), where your parameters x are a simple vector? That's great, just do
using OptimKit
optimize(fg, x0, LBFGS())
and you're good to go; results should be similar to Optim.jl.
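As a minimal sketch of this simple case (assuming, as in the README, that optimize returns the final parameters, function value and gradient as its leading outputs), minimizing a quadratic could look like:

```julia
using OptimKit

# Objective f(x) = ½‖x − b‖², with gradient g(x) = x − b;
# fg returns the (value, gradient) pair that optimize expects.
b = [1.0, 2.0, 3.0]
fg(x) = (sum(abs2, x - b) / 2, x - b)

x0 = zeros(3)
x, fx, gx = optimize(fg, x0, LBFGS())
# x should converge to b, and fx to 0
```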
Do you have a complicated model parameterized by a bunch of variables of different types a, b, c, … which you do not want to wrap in a long vector (note that the parameters can live on a manifold and thus do not even need to exhibit the properties of a vector)? Do you update these parameters in a given tangent direction according to a specific recipe? Is your corresponding gradient direction encoded by one or more objects of custom types? Do you want to specify a specific inner product for these gradients? That's all great. Just do
x0 = (a0, b0, c0, ...)
optimize(fg, x0, algorithm;
    retract = ...,      # some method that tells you how to update the parameters in a given direction
    inner = ...,        # some method that computes the inner product between two tangent directions at the given point x
    add! = ...,         # some method that implements the linear combination of tangent directions
    transport! = ...,   # some method that transports tangent directions
    precondition = ...) # some preconditioner
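To make this concrete, here is a hypothetical sketch of a genuinely Riemannian problem: minimizing the Rayleigh quotient over the unit sphere, whose minimum is the smallest eigenvalue. The retraction follows great circles; its assumed signature, retract(x, d, α) returning the new point and the transported direction, follows the README's conventions, but the exact API may differ.

```julia
using OptimKit, LinearAlgebra

# Minimize f(x) = x' A x over the unit sphere ‖x‖ = 1.
A = Diagonal([1.0, 2.0, 3.0])

function fg(x)
    Ax = A * x
    f = dot(x, Ax)
    g = 2 * (Ax - f * x)  # Riemannian gradient: 2Ax projected onto the tangent space at x
    return f, g
end

# Retraction along the great circle through x in direction d (d tangent at x):
# x(α) = cos(α‖d‖) x + sin(α‖d‖) d/‖d‖, with d transported as x'(α).
function sphere_retract(x, d, α)
    nd = norm(d)
    x′ = cos(α * nd) * x + sin(α * nd) * (d / nd)
    d′ = -sin(α * nd) * nd * x + cos(α * nd) * d
    return x′, d′
end

x0 = normalize(randn(3))
x, fx, = optimize(fg, x0, ConjugateGradient(); retract = sphere_retract)
# fx should approach the smallest eigenvalue of A (here 1.0)
```

The default Euclidean inner product is already the correct Riemannian metric on the sphere, so only the retraction needs to be supplied here.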
The choice to make all of these methods keyword arguments is so that it is easy to experiment with different choices. Maybe your parameters x are a simple Vector, and so are your gradients and tangent directions, but you nonetheless want to use a different inner product. Or your parameters are unitary matrices, and you want to experiment with different retraction schemes. No need to redefine methods or to implement different wrapper types with altered method definitions; just pass along the relevant method via the keyword arguments.
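For instance, swapping in a weighted inner product on plain vectors could look like the following hypothetical sketch (the assumed signature inner(x, d1, d2), taking the current point and two tangent directions, follows the keyword list above; the weight matrix W is purely illustrative):

```julia
using OptimKit, LinearAlgebra

# A simple quadratic objective with its Euclidean gradient.
fg(x) = (sum(abs2, x) / 2, x)

# Weighted inner product ⟨d1, d2⟩ = d1' W d2, with a fixed diagonal weight.
W = Diagonal([1.0, 10.0, 100.0])
myinner(x, d1, d2) = dot(d1, W * d2)

x0 = randn(3)
x, fx, = optimize(fg, x0, GradientDescent(); inner = myinner)
```

No wrapper type around Vector is needed; the algorithm simply uses myinner wherever it would otherwise take a Euclidean dot product.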
See the README for more info.