ANN: Announcing Unitary.jl, a differentiable parametrisation of the group of Unitary Matrices

For our SumProductTransform networks https://arxiv.org/abs/2005.01297, we have created an invertible version of the “Dense” transformation usual in neural networks. Our version features efficient inversion and efficient calculation of the determinant of the Jacobian (of course only where this operation makes sense, i.e. when the map is R^d → R^d). Our approach (detailed in the paper) relies on representing and optimising the dense layer in an SVD-decomposed form, for which we needed a differentiable parametrisation of the group of unitary matrices. We have separated this functionality into its own repo, which is now registered, so you can freely use it with Flux / Zygote: https://github.com/pevnak/Unitary.jl.
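To make the idea concrete, here is a minimal, deliberately naive sketch of the Givens-rotation parametrisation. The names `apply_givens` and `apply_orthogonal` are made up for illustration and are not Unitary.jl's API; the package itself uses a more efficient implementation.

```julia
using LinearAlgebra

# Apply a Givens rotation in the (i, j) coordinate plane to a vector x.
# Written mutation-free so that reverse-mode AD (e.g. Zygote) can in principle
# differentiate through it.
function apply_givens(x, i, j, θ)
    c, s = cos(θ), sin(θ)
    [k == i ? c * x[i] - s * x[j] :
     k == j ? s * x[i] + c * x[j] : x[k] for k in eachindex(x)]
end

# y = Q * x, where Q is a product of d(d-1)/2 Givens rotations,
# one angle per coordinate pair (i, j).
function apply_orthogonal(x, θs)
    d = length(x)
    pairs = [(i, j) for i in 1:d-1 for j in i+1:d]
    for ((i, j), θ) in zip(pairs, θs)
        x = apply_givens(x, i, j, θ)
    end
    return x
end

x  = randn(4)
θs = randn(6)                 # d = 4  ⇒  4·3/2 = 6 angles
y  = apply_orthogonal(x, θs)
@assert norm(y) ≈ norm(x)     # rotations preserve the Euclidean norm
```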

For the implementation of the Dense layer, see https://github.com/pevnak/SumProductTransform.jl/blob/master/src/layers/svddense.jl, which roughly implements the Bijectors.jl interface.
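As a rough illustration of what such an SVD-parametrised invertible layer boils down to, here is a sketch; all names are made up for the example and are neither the actual SvdDense code nor the exact Bijectors.jl interface. In the real layer, U and V are the Givens-parametrised orthogonal factors from the sketch above.

```julia
using LinearAlgebra

struct SVDDenseSketch{T}
    U::Matrix{T}      # orthogonal factor
    logd::Vector{T}   # log of the singular values
    V::Matrix{T}      # orthogonal factor
    b::Vector{T}      # bias
end

# Forward pass: y = U * D * V * x .+ b with D = Diagonal(exp.(logd)).
forward(m::SVDDenseSketch, x) = m.U * (exp.(m.logd) .* (m.V * x)) .+ m.b

# log|det ∂y/∂x| is simply the sum of the log singular values.
logabsdetjac(m::SVDDenseSketch) = sum(m.logd)

# Exact inverse: orthogonal factors invert by transposition, D by negating logd.
inverse(m::SVDDenseSketch, y) = m.V' * (exp.(-m.logd) .* (m.U' * (y .- m.b)))

# Tiny 2D check with explicit rotation matrices standing in for U and V.
rot(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)]
m = SVDDenseSketch(rot(0.3), [0.1, -0.2], rot(-1.1), [0.5, 0.0])
x = randn(2)
@assert inverse(m, forward(m, x)) ≈ x
```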

I would be happy if someone finds a use for this. We are currently working on a GPU version, but it will take a bit of time. Meanwhile, reach out to me with possible enhancements or questions.

Tomas

17 Likes

This is great! I was just starting to write almost exactly Unitary.jl for a project right now; perhaps I won’t have to.

1 Like

Cool package! I know absolutely nothing about the ML context, but just some remarks:

  1. It’s pretty confusing to call these unitaries if you’re only doing real matrices. Why don’t you call them orthogonal?

  2. You can get other parametrizations by using exponential mappings (see the short sketch after this list). They should probably be better in the sense that they deform the metric less.

  3. If I understand correctly it appears you just want to optimize over orthogonal matrices. Optim.jl and Manopt.jl support this. But if I understand correctly your approach is only quadratic scaling, while the Riemannian optimization methods are usually cubic. I guess that’s the main point? CC the manifold people @kellertuer @mateuszbaran
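To spell out the exponential-map parametrisation from point 2, here is a short sketch using only LinearAlgebra (nothing package-specific): any special orthogonal matrix can be written as exp(A) with A skew-symmetric, so one can optimise over the entries of A instead, at cubic cost per evaluation of the matrix exponential.

```julia
using LinearAlgebra

B = randn(4, 4)
A = (B - B') / 2          # skew-symmetric: A' == -A
Q = exp(A)                # matrix exponential ⇒ Q ∈ SO(4)
@assert Q' * Q ≈ I
@assert det(Q) ≈ 1
```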

You can get other parametrizations by using exponential mappings.

I’m not sure it’s different from what they are doing. The Givens rotation is exp(A) where A is skew-symmetric with only a single nonzero upper-triangular entry, i.e., a scaled basis element of the Lie algebra so(n).
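A quick numerical check of that claim, for the (1, 2) plane in dimension 3:

```julia
using LinearAlgebra

θ = 0.7
A = zeros(3, 3); A[1, 2] = -θ; A[2, 1] = θ       # θ times a single so(3) basis element
G = [cos(θ) -sin(θ) 0; sin(θ) cos(θ) 0; 0 0 1]   # Givens rotation in the (1, 2) plane
@assert exp(A) ≈ G
```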

1 Like

Right, I was thinking of the Riemannian optimization setting, where you take the exponential (or some other exponential-like mapping) of a full skew-symmetric matrix (the gradient of the objective function projected onto the tangent space).

That’s interesting. I guess this approach should be faster than typical manifold gradient-based optimization? That is, using the standard matrix representation of orthogonal matrices, projecting the Euclidean gradient onto the Riemannian gradient and applying a retraction. Though I don’t know, maybe using a QR retraction instead of the exact exp would be faster and accurate enough here.
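For concreteness, here is a hedged sketch of one such projection-plus-QR-retraction step on the orthogonal group; helper names are made up for the example, and this is not Manopt.jl’s API.

```julia
using LinearAlgebra

skew(M) = (M - M') / 2

# Project a Euclidean gradient G onto the tangent space of the orthogonal group at Q.
riemannian_grad(Q, G) = Q * skew(Q' * G)

# QR retraction: Q factor of Q + ξ, with signs fixed so that R has a positive diagonal.
function qr_retract(Q, ξ)
    F = qr(Q + ξ)
    return Matrix(F.Q) * Diagonal(sign.(diag(F.R)))
end

# Toy objective f(Q) = ‖Q - T‖²/2 with Euclidean gradient Q - T.
function descend(T; steps = 200, η = 0.1)
    Q = Matrix{Float64}(I, size(T)...)
    for _ in 1:steps
        G = Q - T
        Q = qr_retract(Q, -η * riemannian_grad(Q, G))
    end
    return Q
end

T = exp(skew(randn(4, 4)))          # some target rotation
Q = descend(T)
@assert Q' * Q ≈ I                  # iterates stay (numerically) orthogonal
```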

Would it make sense to try using the matrix in QR-decomposed form instead of SVD?

Actually, we have tried QR as well, but SVD was experimentally better in our application. I believe that was partially caused by the fact that when the angle in a Givens rotation is zero, it represents a diagonal matrix, which is a nice property in machine learning.

1 Like

For a unitary (or orthogonal) matrix of dimension d, we use precisely d(d-1)/2 parameters. Multiplying a vector by the matrix requires 4 times more multiplications and two times more additions (besides the sin and cos evaluations).
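A tiny sanity check of the parameter count for d = 4:

```julia
# d(d - 1)/2 rotation angles versus the d² entries of an unconstrained dense matrix.
d = 4
n_angles = d * (d - 1) ÷ 2    # 6 angles parametrise the rotations of R⁴
n_dense  = d^2                # 16 entries in a full dense matrix
```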

I am sorry for the confusing name. I guess I cannot change it, since the package is registered.