Sparse Gradient using Zygote & Optim

I’m trying to minimize a function, using Zygote to compute the derivatives and BFGS (from Optim) to do the minimization. Can I get sparse gradients/Hessians and use them with Optim? I really need to switch to sparse, since only a 4/n fraction of the parameters (where n is the size of the problem) can be non-zero. But even in the simplest case it returns a dense gradient.

Here is an MWE:

using Zygote
using SparseArrays

f(x)=sum(x.^2)
gradient(f, spzeros(10, 10))[1]
10×10 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

If you know which four parameters are non-zero, then pass them as arguments to an anonymous function and assign them to the correct places in your gradient vector. If you don’t know, then is your function actually differentiable?
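To illustrate the first suggestion, here is a minimal sketch (the positions in `idx`, the problem size `n`, and the objective are all hypothetical): keep the four free values in a short vector, scatter them into a sparse matrix inside the objective, and let Optim work only on that short vector.

```julia
using Optim, SparseArrays

n = 10
idx = [(1, 1), (2, 3), (5, 5), (7, 2)]  # hypothetical positions of the four free parameters

# Objective over just the four free values; p is scattered into a sparse n×n matrix.
function obj(p)
    X = sparse(first.(idx), last.(idx), p, n, n)
    return sum(X .^ 2)
end

# BFGS over a length-4 vector; gradients are finite-differenced by default here.
res = optimize(obj, zeros(length(idx)), BFGS())
```

The gradient BFGS sees is then a length-4 vector, so nothing dense of size n×n is ever materialized.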

Otherwise you might look at SparseDiffTools.jl.


That was my conclusion too. For the sake of completeness:


using Zygote, SparseArrays

# Rebuild a sparse matrix from a vector of stored values, reusing x's sparsity pattern.
function vec2sparse(vec, x)
    return SparseMatrixCSC(x.m, x.n, x.colptr, x.rowval, vec)
end

# Custom adjoint: the pullback keeps only the stored entries of the cotangent,
# so the gradient w.r.t. `vec` is a length-nnz(x) vector rather than a dense matrix.
@adjoint vec2sparse(vec, x) = begin
    return vec2sparse(vec, x), c -> (c.nzval, nothing)
end
Zygote.refresh()
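A usage sketch for wiring this into Optim (the pattern and objective are hypothetical; this assumes the cotangent reaching the `vec2sparse` pullback is sparse with the same pattern, as the adjoint above requires):

```julia
using Optim, Zygote, SparseArrays

# Hypothetical sparsity pattern with four stored entries.
pattern = sparse([1, 2, 5, 7], [1, 3, 5, 2], ones(4), 10, 10)

# Objective over the vector of stored values only.
obj(v) = sum(vec2sparse(v, pattern) .^ 2)

# In-place gradient for Optim, computed by Zygote over the short vector.
g!(G, v) = copyto!(G, gradient(obj, v)[1])

res = optimize(obj, g!, copy(pattern.nzval), BFGS())
```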

This is what I used, thanks!
