Hello,
I would like to find the important directions in the input and output spaces of a differentiable, \mu-measurable function f: \mathbb{R}^n \to \mathbb{R}^d.
One way to identify these directions is to look at the eigendecompositions of the inner and outer products of the Jacobian \nabla_x f(x) \in \mathbb{R}^{d,n}, integrated over the distribution \mu:
C_x = \int (\nabla_x f(x))^\top (\nabla_x f(x)) d\mu(x) \in \mathbb{R}^{n,n} and C_y = \int (\nabla_x f(x)) (\nabla_x f(x))^\top d\mu(x) \in \mathbb{R}^{d,d}
In practice, we can compute a Monte Carlo approximation of C_x and C_y from M samples x^i \sim \mu:
C_x \approx \frac{1}{M}\sum_{i=1}^M (\nabla_x f(x^i))^\top (\nabla_x f(x^i)) and similarly for C_y
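To make the Monte Carlo estimate concrete, here is a minimal NumPy sketch. The test function f(x) = tanh(W x), the choice of \mu as a standard normal, and the sizes n, d, M are all assumptions made for illustration only; they are not part of my actual problem.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, M = 5, 3, 200

# Hypothetical test function f(x) = tanh(W x); its Jacobian is known in closed form.
W = rng.standard_normal((d, n))

def jacobian(x):
    """Jacobian of f(x) = tanh(W x) at x, shape (d, n)."""
    return (1.0 - np.tanh(W @ x) ** 2)[:, None] * W

# Assumption: mu is the standard normal distribution on R^n.
samples = rng.standard_normal((M, n))

# Accumulate the sample averages one Jacobian at a time.
C_x = np.zeros((n, n))
C_y = np.zeros((d, d))
for x in samples:
    J = jacobian(x)
    C_x += J.T @ J / M   # inner product, (n, n)
    C_y += J @ J.T / M   # outer product, (d, d)
```

In this particular example the rows of every Jacobian lie in the row space of W, so C_x has rank at most d < n: it is positive semidefinite but not positive definite.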
Since C_x is symmetric positive semidefinite, an eigendecomposition gives us:
C_x = V \Lambda_x V^\top with V\in \mathbb{R}^{n,n}, where the columns of V form an orthonormal basis of the input space and \Lambda_x is diagonal with nonnegative entries. We can truncate this basis to identify the most important directions based on the decay of the eigenvalues in \Lambda_x.
Similarly, we write C_y = U \Lambda_y U^\top with U\in \mathbb{R}^{d,d}. The columns of U form an orthonormal basis of the output space.
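The eigendecomposition and truncation step could look roughly like the sketch below. The random Jacobian samples with decaying column scales and the 99% energy threshold are assumptions for illustration; any other truncation rule would work the same way.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, M = 6, 4, 100

# Hypothetical stand-in for sampled Jacobians: random (d, n) matrices whose
# columns are scaled so that the spectrum of C_x decays.
scales = 0.5 ** np.arange(n)
jacobians = rng.standard_normal((M, d, n)) * scales

# C_x = (1/M) * sum_m J_m^T J_m
C_x = np.einsum("mij,mik->jk", jacobians, jacobians) / M

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigvals, V = np.linalg.eigh(C_x)
order = np.argsort(eigvals)[::-1]          # reorder to descending
eigvals, V = eigvals[order], V[:, order]
eigvals = np.clip(eigvals, 0.0, None)      # remove tiny negative round-off

# Keep the smallest r whose cumulative energy reaches the (assumed) threshold.
energy = np.cumsum(eigvals) / np.sum(eigvals)
r = int(np.searchsorted(energy, 0.99) + 1)
V_r = V[:, :r]                             # truncated input basis, shape (n, r)
```

The same steps applied to C_y give the truncated output basis.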
However, forming C_x and C_y explicitly is not recommended because it squares the condition number. Instead, one could compute an SVD of \frac{1}{\sqrt{M}} \left[ \nabla f(x^1), \ldots, \nabla f(x^M) \right] \in \mathbb{R}^{d,nM} and of \frac{1}{\sqrt{M}} \left[ \nabla f(x^1)^\top, \ldots, \nabla f(x^M)^\top \right] \in \mathbb{R}^{n,dM}, and extract the left singular vectors to get U and V, respectively. However, assembling these stacked matrices requires a lot of storage.
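For reference, here is a small sketch of the stacked-SVD variant, checked against the explicitly formed C_x and C_y. The random Jacobian samples and the names A_x, A_y are hypothetical; the point is only to show the shapes involved and that the squared singular values match the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, M = 5, 3, 50
jacobians = rng.standard_normal((M, d, n))   # hypothetical Jacobian samples

# A_x = [J_1^T, ..., J_M^T] / sqrt(M), shape (n, d*M), so that A_x A_x^T = C_x.
A_x = np.concatenate([J.T for J in jacobians], axis=1) / np.sqrt(M)
# A_y = [J_1, ..., J_M] / sqrt(M), shape (d, n*M), so that A_y A_y^T = C_y.
A_y = np.concatenate(list(jacobians), axis=1) / np.sqrt(M)

# Left singular vectors give the bases; singular values come out descending.
V, s_x, _ = np.linalg.svd(A_x, full_matrices=False)
U, s_y, _ = np.linalg.svd(A_y, full_matrices=False)

# Explicit matrices, formed here only for comparison.
C_x = np.einsum("mij,mik->jk", jacobians, jacobians) / M
C_y = np.einsum("mij,mkj->ik", jacobians, jacobians) / M
```

The squared singular values s_x**2 and s_y**2 play the role of the diagonals of \Lambda_x and \Lambda_y. The storage issue is visible in the shapes: A_x and A_y each hold n*d*M entries, whereas C_x and C_y hold only n^2 and d^2.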
Is there a better way to do this computation without assembling these large matrices? Are you aware of other algorithms that identify these important directions?