I am trying to find the derivative of a function (that has input as the converged values of Jacobi iterator), with respect to the elements of the matrices used in Jacobi iterations. For example, if f(x) is a function of x, which in turn comes from Ax = b, I am trying to find the derivative of f with respect to element of A and b.
I am using reverseDiff.jl and forwardDiff.jl for this purpose, but all I can get is derivative of f with respect to the converged x, even when I pass b as parameter in the GradientConfig and GradientTape functions.
I have done this in C++ using CoDiPack library, but no success in Julia. Is this even possible in Julia? Any help would be appreciated. Thanks.
Iâm not sure what you mean, but if x is the solution to Ax=b then you can compute analytically the derivative of x wrt A and b, which would probably be more efficient than reverse differentiation on the iteration. You can possibly even use more clever tricks like adjoint methods if you do this several times.
I canât find it analytically, reason being, it is an aerodynamic application. There is fine distribution of points that define the shape of aerofoil. An iterative solver solves some differential equations iteratively with these points as inputs (which are floating point numbers) to get coefficients of drag and lift, and the sensitivities of these coefficients are computed with respect to each point on the surface of aerofoil.
I was using Jacobi as a test case, in which case, A and b would comprise of floating point numbers. Finite differences does not give resuts with desired precision.
It works with the C++ library I am using, just curious if it can be done in Julia.
I am sorry for not being clear.
If someone can shed some light on how to find derivative of f(x(b)) wrt b, with doing anything analytically.
All I get is derivative of f(b) wrt b when using :
b = 2
x = b^2 + b;
f(x) = sin(x) + x^2;
result = ForwardDiff.derivative(f,b)
Independently from the technical aspects of getting this to work, my point was that if you want to compute the derivative of f with respect to b in this way, you have to compute the derivative of x with respect to b as an intermediate step, which is expensive. But because x solves a linear system, the derivative simplifies. Ie let F(b) = f(x(b)), you want to compute the gradient of F, which is J^T grad f, where J = A^-1 is the jacobian of x wrt b. Therefore, the correct way to compute this is to compute grad f, and solve the linear system A^T grad F = grad f by an iterative method.
Note that this is called the the adjoint system of equations. The adjoint method is the same basic idea as reverse-mode auto-differentiation, but when you are using an iterative solver often auto-differentiation tools donât work well (unless they know specifically about the solver being used), and then you have to apply the adjoint method âmanuallyâ.
Thanks for the very nice write-up! Itâs a useful trick that many people and communities use in some form or other and everybody should know about but is not often covered in a general setting in textbooks (Schur complements seem to be another instance of that)
ForwardDiff/ReverseDiff differentiates native Julia functions at the input(s) you provide. When you call ReverseDiff.gradient(g, v) for some function g and some array v, youâre saying âreturn the gradient of g evaluated at vâ.
Correct! Itâs evaluating the gradient of f at the value b.
As @Tamas_Papp mentioned, it seems like you want to differentiate through some intermediary function that youâre not actually calling. Note that x (as youâve defined it) isnât a Julia function, itâs a Vector{Float64}.
Is the below closer to what youâd like to do?
using IterativeSolvers, ReverseDiff
A = [10.0 -1.0 2.0 0.0;
-1.0 11.0 -1.0 3.0;
2.0 -1.0 10.0 -1.0;
0.0 3.0 -1.0 8.0]
b = [6.0 ,25.0 ,-11.0 ,15.0]
f(x) = (x[1]^3) + log10((x[2]^2)) + exp(x[3]) + 4*x[4]
# this is the function you seem to actually want to differentiate
g(A, b) = f(jacobi(A, b))
# get gradients w.r.t. `A` and `b`
dA, db = ReverseDiff.gradient(g, (A, b))
You can, of course, preallocate memory/prerecord the tape etc. to make the above more efficient, but before we go that far, Iâm just making sure that this is closer to your intention.