Hi,
I have a loss function that involves a probability density that requires several matrix operations to derive. When I try to run gradient on it through either Flux or Zygote, I get this error: "Mutating arrays is not supported". If anyone could tell me what that pertains to, perhaps I can adjust the code to comply.
I need to compute the nullspace of a matrix and thus use Julia's linear algebra package. From what I gather, gradient cannot plow through these matrix calculations. So I think this means I need to either write my own SVD solver or wait?
I would suggest providing a custom adjoint to the nullspace operation itself. I usually find this reference useful:
@inproceedings{giles2008extended,
  title={An extended collection of matrix derivative results for forward and reverse mode algorithmic differentiation. An extended version of a paper that appeared in the proceedings of AD2008},
  author={Giles, M},
  booktitle={the 5th International Conference on Automatic Differentiation},
  year={2008}
}
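As an example of what's in that collection: for the matrix inverse, if C = A⁻¹, the reverse-mode sensitivity is Ā = −Cᵀ C̄ Cᵀ. A minimal sketch of registering such a hand-derived rule (using a hypothetical wrapper myinv so it doesn't clash with Zygote's built-in rule for inv):

```julia
using LinearAlgebra, Zygote

# hypothetical wrapper, so we don't shadow Zygote's built-in rule for inv
myinv(A) = inv(A)

# hand-derived reverse-mode rule from Giles (2008):
# C = inv(A)  =>  A_bar = -C' * C_bar * C'
Zygote.@adjoint function myinv(A)
    C = inv(A)
    return C, C_bar -> (-C' * C_bar * C',)
end
```

You can sanity-check it: Zygote.gradient(A -> sum(myinv(A)), A) should agree with Zygote.gradient(A -> sum(inv(A)), A). A rule for nullspace would be derived in the same spirit, starting from the SVD results in the same paper.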
The other problem you will run into is that you are mutating your matrices in-place. This is easy enough to fix. For instance, the line:
r ./= rin[reactiondecay]
Should read:
r /= rin[reactiondecay]
Likewise, your transition function would need to be re-written in the same manner. That might not be straightforward in the case of your B and T transition matrices, and it may be easier to derive and define the adjoints by hand.
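To illustrate the difference with made-up values (the non-mutating broadcast allocates a fresh array instead of overwriting its input, which is what Zygote needs):

```julia
using Zygote

# mutating version: `r ./= s` writes into r in place,
# so Zygote fails with "Mutating arrays is not supported"
loss_mut(r, s) = (r ./= s; sum(r))

# non-mutating version: `r ./ s` allocates a new array, which Zygote can differentiate
loss_ok(r, s) = sum(r ./ s)

g, = Zygote.gradient(r -> loss_ok(r, [2.0, 4.0]), [1.0, 1.0])
# g == [0.5, 0.25], i.e. 1 ./ s
```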
One trick I’ve used is to differentiate the matrix construction with ForwardDiff, and then use the resulting Jacobian in the backward pass as a vector-Jacobian product, like so:
forward(params) = ... # constructs a matrix T from params

Zygote.@adjoint function forward(params)
    T = forward(params)
    return T, function (T_bar)
        # Jacobian of vec(T) with respect to params
        J = ForwardDiff.jacobian(forward, params)
        # backward pass is the vector-Jacobian product;
        # Zygote expects a tuple with one gradient per argument
        return (vec(vec(T_bar)' * J),)
    end
end
It may not be the most efficient way to calculate things, but I trust it more than doing it by hand.
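A concrete sketch of the pattern, with a made-up two-parameter build function standing in for the real transition matrix:

```julia
using ForwardDiff, Zygote

# hypothetical parameter-to-matrix constructor, standing in for the real T
build(p) = [p[1]^2 p[2]; p[1]*p[2] p[2]^2]

Zygote.@adjoint function build(p)
    T = build(p)
    return T, function (T_bar)
        J = ForwardDiff.jacobian(build, p)  # Jacobian of vec(T) w.r.t. p
        return (vec(vec(T_bar)' * J),)
    end
end

g, = Zygote.gradient(p -> sum(build(p)), [1.0, 2.0])
# g == [4.0, 6.0], since sum(T) = p1^2 + p2 + p1*p2 + p2^2
```

Note ForwardDiff builds the full Jacobian, so this is best for matrices constructed from a modest number of parameters.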