See "Optimization on Stiefel manifold with auto-differentiation". There are two options:
- Use a manifold-optimization package such as Manopt.jl, which knows how to optimize over the orthogonal group O_n directly. It also looks like it supports a Cartesian product of manifolds, so the variable x \in \mathbb{R}^k can be included as well.
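- Use a (differentiable) change of variables to unconstrained matrices X via the polar decomposition: for any full-rank X, the polar factor X(X^T X)^{-1/2} is orthogonal, and every orthogonal matrix is its own polar factor, so nothing in O_n is lost. This has the advantage of letting you use any ordinary unconstrained optimization algorithm: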
\min_{(X,x)\in \mathbb{R}^{n \times n} \times \mathbb{R}^k} f(X(X^T X)^{-1/2}, x)
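
The change of variables in the second option can be sketched as follows. This is a minimal illustration in Python/NumPy, not a definitive implementation: the names `polar_factor` and `make_unconstrained` and the toy objective `f` are all hypothetical, and X is assumed to be full rank. The polar factor is computed from the SVD X = U S V^T as U V^T, which equals X(X^T X)^{-1/2} when X is full rank.

```python
import numpy as np

def polar_factor(X):
    """Orthogonal polar factor of a full-rank square X.

    With the SVD X = U S V^T, X (X^T X)^{-1/2} = U V^T.
    """
    U, _, Vt = np.linalg.svd(X)
    return U @ Vt

def make_unconstrained(f, n, k):
    """Wrap f(Q, x) as a function of one flat vector of length n*n + k."""
    def g(theta):
        X = theta[: n * n].reshape(n, n)   # unconstrained matrix
        x = theta[n * n :]                 # unconstrained vector in R^k
        return f(polar_factor(X), x)
    return g

# Toy objective, purely for illustration (an assumption, not from the post).
rng = np.random.default_rng(0)
n, k = 3, 2
A = rng.standard_normal((n, n))

def f(Q, x):
    return np.sum((Q - A) ** 2) + np.sum((x - 1.0) ** 2)

g = make_unconstrained(f, n, k)
theta0 = rng.standard_normal(n * n + k)    # arbitrary starting point
X0 = theta0[: n * n].reshape(n, n)
Q0 = polar_factor(X0)                      # orthogonal by construction
```

The wrapped function `g` takes a plain vector, so it can be handed to any general-purpose unconstrained optimizer. To get gradients by automatic differentiation, the same structure would have to be written in a framework whose SVD (or matrix inverse square root) is differentiable.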