# Computing null space of a Flux Chain

I have trained a Flux Chain with Dense layers. I would like to determine the null space (the space of input vectors for which the output is all zeros) of this Chain. I am looking for a Julian way to determine the projection matrix for this Chain. However, I am open to other approaches.

Thanks
Jagir

Hmm, interesting. I wonder how this is done in other languages.

So you are solving

f(x) = 0 where f is your neural network of dense layers.

So

act(M_n*...*act(M_3*act(M_2*act(M_1*x)))) = 0

I never thought this would be easy anywhere. I would run an optimization algorithm on

f(x)^2 and try to drive it to 0. Starting at random x I can find new points, but I don't know if it's possible to find a "space". Sounds very hard.

1 Like

Something like this works for me

```julia
using Flux, Optim

m = Chain(
    Dense(2, 8, relu),
    Dense(8, 8, relu),
    Dense(8, 8),
)

loss(x) = sum(abs2, m(x))

function make_null_sample(n)
    null_space_sample = Vector{Float32}[]

    for i in 1:n
        result = optimize(loss, rand(2))
        push!(null_space_sample, result.minimizer)
    end
    null_space_sample
end

@time null_space_sample = make_null_sample(1000)

null_space_sample_x = [x for (x, _) in null_space_sample]
null_space_sample_y = [y for (_, y) in null_space_sample]

using Plots

scatter(null_space_sample_x, null_space_sample_y)
```
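
One caveat worth noting: `optimize(loss, x0)` defaults to the derivative-free Nelder-Mead method, which only finds local minima, so some of the collected points may not be actual roots of the network. A minimal filtering step, reusing the `loss` defined above (the tolerance `1e-8` is an arbitrary choice for this sketch):

```julia
# Keep only the minimizers whose objective is numerically zero; the rest
# are local minima of loss, not actual points of the null set.
actual_roots = filter(x -> loss(x) < 1e-8, null_space_sample)
```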


Relu fixes 0, so the nullspace of the first layer embeds into the kernel of the larger chain. In addition you get anything the first layer maps to a completely negative vector, and anything that's in the nullspace of a subsequent layer. Since these layers are linear you can pull them back to the input. As a result, you can definitely compute the kernel analytically for simple networks like this, but note that it won't be a vector space.
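
To make the first part concrete, here is a minimal sketch. It assumes zero biases (Flux's default initialization; a trained network will generally not have them), so that a nullspace vector of the first weight matrix is annihilated by the whole chain:

```julia
using Flux, LinearAlgebra

# With zero biases: if W1 * x == 0 then relu(W1 * x) == 0, and every later
# layer maps 0 to 0, so x lies in the kernel of the whole chain.
m = Chain(Dense(3, 2, relu), Dense(2, 2))  # 2x3 weight => nontrivial nullspace

N = nullspace(m[1].weight)  # orthonormal basis of ker(W1)
x = N[:, 1]
@assert isapprox(m(x), zeros(Float32, 2); atol = 1f-5)
```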

OK, so it might be easier for the relu activation, but the OP did not specify an activation. So I wonder if the OP is after a general solution.

Yes, of course. It's not tractable analytically in general.

1 Like

Let \mathbf{F}(\mathbf{x}) be the neural network function with elements f_i(\mathbf{x}). If you want a "linear" subspace then first you need a point \mathbf{x}^* in that subspace where \mathbf{F}(\mathbf{x}^*) = \mathbf{0}. Then you need to find a set of vectors \Delta \mathbf{x}_j such that (\nabla_{\mathbf{x}} \mathbf{F}(\mathbf{x}^*))^T \Delta \mathbf{x}_j = \mathbf{0}, where \nabla_{\mathbf{x}} \mathbf{F}(\mathbf{x}^*) is the Jacobian of \mathbf{F} wrt \mathbf{x} at \mathbf{x}^*. This will give you the "linear nullspace" at \mathbf{x}^*. To make sure it is the nullspace at neighboring points also, a sufficient but not necessary condition is \mathbf{H}^i_{\mathbf{x} \mathbf{x}}(\mathbf{x}^*) \Delta \mathbf{x}_j = \mathbf{0} for each nullspace basis vector \Delta \mathbf{x}_j and each function f_i, where \mathbf{H}^i_{\mathbf{x} \mathbf{x}}(\mathbf{x}^*) is the Hessian of f_i wrt \mathbf{x} at \mathbf{x}^*. The relu function is piecewise linear so the Hessians will all be 0. So you only need a point and the nullspace of the Jacobian transpose.

Edit: the necessary condition can be written using a 3rd-order tensor reduction, but to avoid that I only used the sufficient condition above.
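
A minimal sketch of the Jacobian step, assuming Zygote for differentiation and reusing the chain `m` and the `null_space_sample` from the Optim example earlier in the thread:

```julia
using Zygote, LinearAlgebra

# xstar must be a root of the network, i.e. m(xstar) ≈ 0; here I take one
# of the points sampled earlier. Note that F maps R^2 -> R^8 for this chain,
# so a generic Jacobian has a trivial nullspace; nontrivial directions only
# appear when the outputs are degenerate at xstar.
xstar = null_space_sample[1]

J = Zygote.jacobian(m, xstar)[1]  # 8x2 Jacobian of F at xstar
Δx = nullspace(J)                 # columns satisfy J * Δx_j ≈ 0
```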

1 Like

Hi Xiaodai,
Thanks for the suggestion. I had this in mind, but as you mentioned, this is an expensive way to approximate the null space.

Hi Keno,
Thanks for sharing this observation. I am using swish, which does have this property. Either way, it is a nice approach for handling such problems.