Optimization on Stiefel manifold with auto-differentiation

Not quite, as explained in the thread "Zygote.jl: How to get the gradient of sparse matrix" (post #6 by stevengj): if you are going to write an rrule, it typically has to be for the function that includes both the sparse-matrix construction and how you use the sparse matrix. (This isn't usually so hard, though.)
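To make that concrete, here is a minimal sketch of what such a rule could look like. The function `sparse_quadform` and its argument names are hypothetical, made up purely for illustration and not taken from your code; it assumes real-valued inputs and uses only the standard `ChainRulesCore` interface (`rrule`, `NoTangent`, `unthunk`). The point is that the rule covers the whole function, from the nonzero values `vals` through `sparse(...)` to the scalar result, so the AD system never has to differentiate through the sparse constructor itself.

```julia
using ChainRulesCore, SparseArrays, LinearAlgebra

# Hypothetical example function: build a sparse matrix from (rows, cols, vals)
# and immediately use it to compute the scalar x' * A * x (real inputs assumed).
function sparse_quadform(rows, cols, vals, x)
    n = length(x)
    A = sparse(rows, cols, vals, n, n)
    return dot(x, A * x)
end

# One rrule for construction *and* use of the sparse matrix.
function ChainRulesCore.rrule(::typeof(sparse_quadform), rows, cols, vals, x)
    n = length(x)
    A = sparse(rows, cols, vals, n, n)
    y = dot(x, A * x)
    function sparse_quadform_pullback(ȳ)
        Δ = unthunk(ȳ)
        # d(x'Ax)/d(vals[k]) = x[rows[k]] * x[cols[k]]; this also holds when
        # (rows, cols) contains duplicates, since sparse() sums those entries.
        ∂vals = Δ .* x[rows] .* x[cols]
        # d(x'Ax)/dx = (A + A') x
        ∂x = Δ .* (A * x .+ A' * x)
        # No tangents for the function itself or for the integer index vectors.
        return NoTangent(), NoTangent(), NoTangent(), ∂vals, ∂x
    end
    return y, sparse_quadform_pullback
end
```

With a rule like this in scope, Zygote should pick it up for calls to `sparse_quadform`, so the gradient with respect to `vals` comes back as an ordinary dense vector of the same length as `vals`. The rule for your actual function would have the same shape, with the pullback worked out for whatever you do with the matrix.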

Where are the sparse matrices in your problem description above?

(You might also try Enzyme.jl.)