Hi, I’m new to Julia and looking to learn more about autodiff in Julia. I’ve been playing with matrix multiplication and want to get the derivative of a matrix with respect to a matrix.
For example, in the equations below, I want to compute \frac{\partial X_4}{\partial X_1}, where both X_1 and X_4 are matrices (the W_i are also matrices, and f is some function applied to each element of the matrix).
This is very similar to a neural network, but here my output is not a scalar. I want to use forward-mode autodiff for this; however, I can’t figure out how to do it with the ForwardDiff package.
Apart from this, it would be great if someone could explain how the chain rule works in this case (I’m familiar with reverse-mode autodiff where the output is a scalar, but I can’t work out the mathematics behind something similar in forward mode).
If X_{k+1} = f(X_k W_{k+1}) with f applied elementwise, then the chain rule in forward mode gives a recurrence for the directional derivative:

\delta X_{k+1} = f'(X_k W_{k+1}) \odot (\delta X_k W_{k+1}),

where \odot denotes the elementwise (Hadamard) product.
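Concretely, for a single layer Y = f(XW), the forward-mode rule \delta Y = f'(XW) \odot (\delta X\, W) can be checked numerically. A minimal sketch (tanh as f, and the sizes, are my own placeholders):

```julia
using LinearAlgebra

f(x)  = tanh.(x)             # placeholder elementwise nonlinearity
fp(x) = 1 .- tanh.(x) .^ 2   # its elementwise derivative

# One forward-mode layer step: map (X, δX) to (Y, δY) for Y = f(X*W).
layer(X, dX, W) = (f(X * W), fp(X * W) .* (dX * W))

X, dX, W = randn(5, 2), randn(5, 2), randn(2, 3)
Y, dY = layer(X, dX, W)

# Finite-difference check of the directional derivative:
h = 1e-6
dY_fd = (f((X .+ h .* dX) * W) - f(X * W)) ./ h
maximum(abs.(dY - dY_fd))  # small, O(h)
```

Chaining `layer` through all the W_k propagates \delta X_0 through to \delta X_4 in a single forward pass.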
Now, the question is, when you ask for the derivative (Jacobian) of the output, what do you want? If you want the directional derivative \delta X_4 in a certain input direction \delta X_0, you can directly read it off above. If you want the Jacobian “matrix”, in some sense it’s “really” a rank-4 tensor (4d array). ForwardDiff gives you the Jacobian assuming the input and output are “flattened” using vec. In either case you need to do a bit of multilinear algebra to work it out by hand.
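For example (a toy three-layer version of the setup, with tanh and random weights as my own placeholders), ForwardDiff.jacobian accepts an array-valued function of an array and returns the Jacobian of vec(output) with respect to vec(input):

```julia
using ForwardDiff

n = 5
W1, W2, W3 = randn(2, 4), randn(4, 3), randn(3, 1)
f(x) = tanh.(x)                       # placeholder elementwise nonlinearity
net(X) = f(f(f(X * W1) * W2) * W3)    # X4 as a function of X1

X1 = randn(n, 2)
J = ForwardDiff.jacobian(net, X1)     # (n*1) × (n*2): the vec-flattened Jacobian
```

Each column of J is the directional derivative for a unit perturbation of one entry of X1 (in column-major order), and reshaping J to n × n × 2 recovers the tensor view.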
In my problem, X_0 is n \times 2 and X_4 is n \times 1, so what I want is \frac{\partial X_4(i)}{\partial X_0(i,1)} and \frac{\partial X_4(i)}{\partial X_0(i,2)}. I believe both of these would be n \times 1 vectors (here i=1:n).
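Since each layer X_{k+1} = f(X_k W_{k+1}) mixes columns but not rows, row i of X_4 depends only on row i of X_0, so \frac{\partial X_4(i)}{\partial X_0(j,k)} = 0 for j \ne i. The two n-vectors you want are then the diagonals of the two n × n blocks of the vec-flattened Jacobian. A sketch (weights and tanh are again my own placeholders):

```julia
using ForwardDiff, LinearAlgebra

n = 5
W1, W2, W3 = randn(2, 4), randn(4, 3), randn(3, 1)
f(x) = tanh.(x)                       # placeholder elementwise nonlinearity
net(X) = f(f(f(X * W1) * W2) * W3)    # n×2 input -> n×1 output

X0 = randn(n, 2)
J = ForwardDiff.jacobian(net, X0)     # n × 2n; vec is column-major

d1 = diag(J[:, 1:n])                  # ∂X4(i)/∂X0(i,1), an n-vector
d2 = diag(J[:, n+1:2n])               # ∂X4(i)/∂X0(i,2), an n-vector
```

Equivalently, because the rows are independent, a single forward-mode pass with \delta X_0 = [\mathbf{1} \; \mathbf{0}] (all-ones first column, zero second column) yields d1 directly, without forming J at all.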