Help with gradient

Hi,
I have a loss function that involves a probability density that requires several matrix operations to derive. When I tried running gradient on it through either Flux or Zygote I get this error: Mutating arrays is not supported. If anyone could tell me what that pertains to then perhaps I can adjust the code to comply.

thanks,
carson

Do what the error message tells you: don’t mutate arrays, use a functional style (map, comprehensions, etc).

It is difficult to help more without a concrete, self-contained example.

Also, see

https://fluxml.ai/Zygote.jl/latest/utils/#Zygote.Buffer

Thanks for your prompt reply. Sorry for not providing enough information. I think you have answered my question but here is my code https://github.com/ccc1685/transcription-model/blob/master/loss.jl. I want to find the gradient of function loss(x).

I need compute the nullspace of a matrix and thus use Julia’s linear algebra package. From what I gather, gradient cannot plow through matrix calculations. So I think this means I need to either write my own SVD solver or wait?

I would suggest providing a custom adjoint to the nullspace operation itself. I usuallly find

@inproceedings{giles2008extended,
  title={An extended collection of matrix derivative results for forward and reverse mode algorithmic differentiation. An extended version of a paper that appeared in the proceedings of AD2008},
  author={Giles, M},
  booktitle={the 5th International Conference on Automatic Differentiation},
  year=2008
}

useful

2 Likes

Thanks for your help

The other problem you will run into is that you are mutating your matrices in-place. This is easy enough to fix. For instance, the line:

r ./= rin[reactiondecay]

Should read:

r /= rin[reactiondecay]

Likewise, your transition function would need to be re-written in the same manner. That might not be straightforward in the case of your B and T transition matrices, and it may be easier to derive and define the adjoints by hand.

One trick I’ve used is to calculate the matrix construction with ForwardDiff, and then use the resulting Jacobian to calculate the vector’-Jacobian product, like so:

forward(params) = ... # constructs a matrix T from params

Zygote.@adjoint function forward(params)
    T = forward(params)
    return T, function (params_bar)
        J = ForwardDiff.jacobian(forward, params)
        # backward pass is vector-Jacobian product
        vjp = params_bar' * J
        return vjp
    end
end

It may not be the most efficient way to calculate things, but I trust it more than doing it by hand.

2 Likes

Thanks. This concept of mutating matrices is foreign to me so your explanation was very helpful.