Identify important subspaces of a function without squaring the conditioning number

mleprovost · October 19, 2020, 11:58pm

Hello,

I would like to find the important directions in the input and output spaces for a \mu-measurable function f: \mathbb{R}^n \xrightarrow{} \mathbb{R}^d.

One way to identify these directions is to look at the eigenvalue decomposition of the integrated inner and outer product of the Jacobian of the function f integrated over the distribution \mu:

C_x = \int (\nabla_x f(x))^\top (\nabla_x f(x)) d\mu(x) \in \mathbb{R}^{n,n} and C_y = \int (\nabla_x f(x)) (\nabla_x f(x))^\top d\mu(x) \in \mathbb{R}^{d,d}

In practice, we can perform a Monte Carlo appproximation of C_x, C_y given M samples x^i \sim \mu:

C_x \approx \frac{1}{M}\sum_{i=1}^M (\nabla_x f(x^i))^\top (\nabla_x f(x^i)) and similarly for C_y

Since C_x is positive definite, an eigendecomposition gives us:

C_x = V \Lambda_x V^T with V\in \mathbb{R}^{n,n}, where V an orthonormal basis for the input space. We can truncate this basis to identify the most important directions based on the decay of energy spectrum of \Lambda_x.

Similarly, we write C_y = U \Lambda_y U^T with U\in \mathbb{R}^{d,d}. U is an orthonormal basis for the output space.

However, this computation is not recommended because it squares the conditioning number. One could perform a SVD of \frac{1}{\sqrt{M}} \left[ \nabla f(x^1), \ldots, \nabla f(x^M) \right] and \frac{1}{\sqrt{M}} \left[ \nabla f(x^1)^\top, \ldots, \nabla f(x^M)^\top \right] and extract the left singulars to get U and V. However assembling these stacked matrices require a lot of storage.

Is there a better way to do this computation without assembling these large matrices? Are you aware of other algorithms that identify these important directions?

mohamed82008 · October 20, 2020, 12:54am

Consider using svdl from IterativeSolvers. As for avoiding forming the matrices, you can technically use a linear operator that defines multiplication and adjoint multiplication but the real question here is how to make these operations more efficient than simply forming the matrices and multiplying. In your case, the matrices have a special meaning. They are the Jacobians which means we can use tricks from automatic differentiation here to possibly improve things.

The operator \nabla f(x=x^1)^T v is essentially the gradient of f(x)^T v wrt x at x = x^1. Often, computing this will be more efficient than computing the full Jacobian. So consider using a reverse-mode AD package such as Zygote to define this operator. Even better, if you can derive a custom “adjoint rule” in ChainRules.jl, you can use this directly.

The second operator you need is \nabla f(x=x^1) v. This is the derivative of the vector valued function f(x(u)) wrt the scalar u where x(u) is specially defined such that x(u) = x^1 and dx/du is the vector v. This is kind of a hack to then use forward-mode automatic differentiation to compute the derivative of f wrt u which will be \nabla f(x=x^1) v. Again, you can define a special frule using ChainRules.jl to specify this operator efficiently or just use ForwardDiff2.jl to compute it for your function.

Let me know if this approach works for you

mleprovost · October 20, 2020, 4:06am

Thank you for your valuable feedback. For now, I can only compute the Jacobian of f with finite differences because ForwardDiff does not support complex differentiation.

I will look for now at svdl

Topic		Replies	Views
Resolve sign ambiguity in SVD and Eigendecomposition General Usage question , linearalgebra	13	1951	May 22, 2021
Automatic differentiation of f(x) constructed by decomposition of a matrix M(x) Optimization (Mathematical) differentiation , flux , machine-learning , numerics	7	2861	January 8, 2020
Efficient Vector^T * Jacobian matrix? Numerics	7	3492	September 11, 2024
How to think about the cost of vJ and Jv (pullback and pushforward) Numerics question , ad	6	92	May 15, 2025
Efficient pushforward (Jacobian vector product) with multiple vectors using automatic differentiation Performance autodiff	13	542	May 27, 2023

Identify important subspaces of a function without squaring the conditioning number

Related topics