Hello, I’m once again running into the limits of my Julia skills.
Background
I’m using forward-mode automatic differentiation on a function F that maps a matrix A\in\mathbb{R}^{N \times N} to a scalar, i.e. F:\mathbb{R}^{N \times N}\to \mathbb{R}. Given the function and an input matrix, I can compute the gradient \nabla F with respect to each element of A with ForwardDiff.gradient(F, A). Using ForwardDiff is faster than evaluating the analytical expressions of the partial derivatives, even for small problems (N < 100).
Problem setting
Some numerical experiments for the actual application have shown that there are a lot of identical values in the matrix \nabla F - which can be (very) large - so I’d like to save computation time and memory by computing only the unique elements of \nabla F (and combining this with IndirectArrays).
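To make the target storage concrete, this is the kind of compressed representation I have in mind with IndirectArrays (the values and index matrix below are made up purely for illustration):
using IndirectArrays
vals = [0.5, 1.0, 2.0] # the unique values appearing in ∇F (illustrative numbers)
idx  = [1 2; 2 3]      # idx[i, j] selects vals[idx[i, j]]
∇F_compressed = IndirectArray(idx, vals)
@assert ∇F_compressed[1, 2] == 1.0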
Based on what I’ve read here, I made an MWE to calculate a specific \dfrac{\partial F}{\partial a_{i,j}} for a function F.
using ForwardDiff

N = 100
# test function with the result initialized inside the function
function F(x::AbstractArray{T}) where {T}
    res = zero(T)
    for i in 1:size(x, 1)
        for j in 1:size(x, 2)
            res += x[i, j]^(i / j)
        end
    end
    return res
end
# location where we want the gradient
A = rand(N,N)
# computing the gradient (for all variables) > OK
∇foo = ForwardDiff.gradient(F, A)
# computing a specific component of the gradient > OK
inds = LinearIndices((N,N))
(t, s) = (1, 3) # indices of the partial derivative we want: ∂F/∂a_{t,s}
vv = inds[t,s] # associated linear index
subfoo = x -> F(reshape(vcat(A[1:vv-1], x, A[vv+1:end]), N,:))
res = ForwardDiff.derivative(subfoo, A[t,s])
@assert res == ∇foo[t,s]
This works, but I have the following questions:
- Could this be made more efficient? In particular the definition of subfoo, where a new matrix has to be allocated for every (t, s). Maybe pre-allocate this in some way? (One idea is sketched after the second code block below.)
- How could this be implemented in a more generic way for different functions? Below I added an additional example, where I would like to avoid a lot of matrix allocations, but this is not working (commented line).
function bigfoo(t, s, foo::Function, A::AbstractArray, args...)
    backup = A[t, s] # remember the original value of A[t,s]
    # closure without a new matrix (not working: A has eltype Float64, so
    # setindex! cannot store the Dual numbers that ForwardDiff passes in)
    #localfoo = (x::Number,) -> foo(setindex!(A, x, t, s), args...)
    # closure with a new matrix (working, but builds a lot of matrices)
    inds = LinearIndices(size(A))
    vv = inds[t, s] # linear index associated with (t, s)
    localfoo = (x::Number,) -> foo(reshape(vcat(A[1:vv-1], x, A[vv+1:end]), size(A, 1), :), args...)
    res = ForwardDiff.derivative(localfoo, A[t, s])
    # restore the original value (only needed to undo the setindex! variant)
    A[t, s] = backup
    return res
end
@assert bigfoo(t,s, F, A) == ∇foo[t,s]
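Regarding the first question (pre-allocation), one direction I’m considering is to pre-allocate the Dual-valued copy of A once and then only re-seed the single entry (t, s) for each partial derivative, so no new matrix is built per call. A minimal sketch, using the default-tagged Dual(value, partial) constructor; the helper name partial_at! is mine:
using ForwardDiff: Dual, partials
# pre-allocate the Dual version of A once, with zero seeds everywhere
Ad = Dual.(A, 0.0)
# seed entry (t, s), evaluate foo once, read off the partial, reset the seed
function partial_at!(Ad, foo::Function, A::AbstractArray, t, s, args...)
    Ad[t, s] = Dual(A[t, s], 1.0)
    res = partials(foo(Ad, args...))[1]
    Ad[t, s] = Dual(A[t, s], 0.0) # reset so Ad can be reused for the next entry
    return res
end
@assert partial_at!(Ad, F, A, t, s) ≈ ∇foo[t, s]
This trades the per-(t, s) vcat/reshape allocations for a single up-front Dual matrix, at the cost of one full function evaluation per partial derivative.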
- Finally, what would be an efficient way to identify the values that will be identical in \nabla F? I’m currently thinking about using an analytically determined gradient (on a smaller scale) and expanding the pattern of identical values to the scale of the real problem.
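As a rough sketch of that idea (the structured test input and the rounding-based equality tolerance are assumptions on my part), one could group matrix positions by their gradient value on a small instance and hope the grouping pattern transfers to the full-size problem:
using ForwardDiff
Nsmall = 10
Asmall = fill(2.0, Nsmall, Nsmall) # structured input so duplicate gradient entries appear
Gsmall = ForwardDiff.gradient(F, Asmall)
# group positions sharing the same (rounded) gradient value
groups = Dict{Float64,Vector{CartesianIndex{2}}}()
for I in CartesianIndices(Gsmall)
    key = round(Gsmall[I], digits = 12) # equality up to rounding (my assumption)
    push!(get!(groups, key, CartesianIndex{2}[]), I)
end
# each vector in values(groups) now lists the positions sharing one value
For this F the entries with equal ratio i/j coincide, so the small-scale groups could then be mapped to their large-scale counterparts.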
All the above was done in Julia 1.6.