Optimization on unit sphere?

Random question about Differential Evolution: do you happen to know if there’s a variant of DE that keeps all vectors of the population on the unit sphere or on the unit simplex?

My optimization problem is of the form:

\min_x f(x) \quad \text{s.t.} \quad \sum_k x_k^2 = 1

so x should be on the sphere. Thus, I can’t use basic DE because the mutation operation will leave the sphere.

Should I just run the usual DE step (mutation + crossover) and then project the result onto the sphere, like in projected gradient descent? I tried it and it kinda works (it seems to minimize the objective), but I don’t know if it’s the right way.
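Roughly, by “project after the DE step” I mean something like this (a minimal sketch with made-up names, not from any package):

```julia
using Random, LinearAlgebra

# One DE/rand/1/bin generation, with each trial vector normalized back onto
# the unit sphere before selection. All names here are illustrative.
function projected_de_step(pop::Vector{Vector{Float64}}, f; F = 0.8, CR = 0.9)
    N = length(pop)
    newpop = similar(pop)
    for i in 1:N
        a, b, c = shuffle(setdiff(1:N, i))[1:3]     # three distinct partners
        donor = pop[a] .+ F .* (pop[b] .- pop[c])   # mutation (leaves the sphere)
        jrand = rand(1:length(donor))               # force at least one donor gene
        trial = [k == jrand || rand() < CR ? donor[k] : pop[i][k]
                 for k in eachindex(donor)]         # binomial crossover
        trial ./= norm(trial)                       # projection back onto the sphere
        newpop[i] = f(trial) < f(pop[i]) ? trial : pop[i]   # greedy selection
    end
    return newpop
end
```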

I also tried projecting DE’s “donor” points onto the plane tangent to the current x_i, running mutation & crossover on these projections and then projecting/“retracting” the result back onto the sphere. This seems to minimize the objective too, but again I’m not sure if it’s a fluke.
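And the tangent-plane variant I tried looks roughly like this (again just a sketch; crossover is omitted for brevity):

```julia
using LinearAlgebra

# Project the partner points into the tangent plane at x, do the DE
# arithmetic there, then retract the result back onto the sphere.
tangent_project(x, p) = p .- dot(p, x) .* x   # component of p tangent at x
retract(x, v) = normalize(x .+ v)             # projection retraction

function tangent_de_mutation(x, a, b, c; F = 0.8)
    v = tangent_project(x, a) .+
        F .* (tangent_project(x, b) .- tangent_project(x, c))
    return retract(x, v)                      # back onto the sphere
end
```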

Same with x inside the unit simplex:

\min_x f(x) \quad \text{s.t.} \quad \sum_k x_k = 1,\; x_k > 0\;\forall k

Are there variants of DE that respect such constraints? I’ve been searching on and off for a couple of months but have found literally nothing on this topic.

Sorry, I know it’s off-topic for this thread, but since DE was mentioned, I have to seize the opportunity.

The Manopt.jl package supports optimization on manifolds (such as the unit sphere) and contains several population-based metaheuristic optimizers such as PSO and CMA-ES.
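For instance, something along these lines should get you started (an untested sketch; if I recall the API correctly, Manopt.jl cost functions take the manifold as their first argument — check the docs for the exact keywords):

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(9)                       # the unit sphere in R^10
A = let B = randn(10, 10); B'B end  # an example convex quadratic objective
f(M, p) = dot(p, A, p)              # minimized on the sphere by the smallest eigenvector

p_opt = particle_swarm(M, f)        # manifold-aware particle swarm
```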


A simple trick is just to change variables to

\min_{y \in \mathbb{R}^n} f(y / \Vert y \Vert_2)
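In code this is a one-line wrapper (a sketch; the `f` here is just a placeholder convex objective):

```julia
using LinearAlgebra

f(x) = sum(abs2, x .- [1.0, 0.0, 0.0])   # placeholder convex objective
g(y) = f(y ./ norm(y))                   # minimize g over all of R^n, unconstrained
```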

Yeah, this is very simple; it’s what I was using initially. But I know that f(x) is convex and thus “easy” to optimize, and the optimum is unique, so I don’t want to lose convexity.

Doesn’t the y/\Vert y \Vert_2 transformation make the function non-convex in y and thus harder to optimize? Perhaps it introduces local minima?

Similarly, I can use the f(\operatorname{softmax}(y)) transformation for the unit simplex case. But then again, I think this makes the resulting function non-convex and potentially arbitrarily complex, since softmax itself is non-convex?
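For concreteness, the transformation I mean is (a sketch; `f` here is a placeholder objective on the simplex):

```julia
f(x) = -sum(log, x)                  # placeholder convex objective on the open simplex
softmax(y) = (e = exp.(y .- maximum(y)); e ./ sum(e))   # numerically stable
h(y) = f(softmax(y))                 # unconstrained in y; softmax(y) is always feasible
```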

Another simple trick is just to define:

x_n = \sqrt{1 - \sum_{k=1}^{n-1} x_k^2}

and then optimize only x_1, \ldots, x_{n-1}: any choice of those values with \sum_{k=1}^{n-1} x_k^2 \le 1 lifts to a point on the manifold (note that this parameterizes only the hemisphere x_n \ge 0). A sketch of the lift follows below.
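A minimal sketch of that lift (`lift` is just an illustrative name; the `max` guards against tiny negative values from round-off):

```julia
# Lift the free variables x_1, ..., x_{n-1} to a point on the unit sphere.
# Only valid when sum(abs2, x_free) <= 1, and only reaches x_n >= 0.
lift(x_free) = vcat(x_free, sqrt(max(0.0, 1 - sum(abs2, x_free))))
```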

Yet another way is to optimize in the coordinate space of a stereographic projection.

These helper functions may be of interest.

In practice, we run nonlinear conjugate gradient optimization for some fixed number of “sub” iterations with a fixed stereographic projection axis. The outer loop then updates the projection axis to the current manifold point.
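For reference, the chart formulas for a fixed pole e_n look like this (a sketch; the rotation needed to re-center the chart on an arbitrary axis is omitted):

```julia
using LinearAlgebra

# Stereographic chart from the pole e_n = (0, ..., 0, 1).
to_plane(x)  = x[1:end-1] ./ (1 - x[end])                            # sphere → R^{n-1}
to_sphere(y) = vcat(2 .* y, sum(abs2, y) - 1) ./ (sum(abs2, y) + 1)  # R^{n-1} → sphere

# An unconstrained method (e.g. nonlinear CG) can then minimize
# y -> f(to_sphere(y)), re-centering the chart every few iterations.
```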