Help with more efficient function using ForwardDiff


#1

Hi, I have a struct whose fields are either arrays or scalars, and I use it as input to a function that broadcasts over them. I then need the derivative of this function with respect to x, evaluated per “row” of the output, i.e. not the derivative of every row at one value, but each row differentiated at its own value of x. Here is an MWE:

using ForwardDiff

mutable struct pa
    a
    b
end
# function can take mixed scalars or arrays (as long as all arrays are equal length)
p=pa(0.3,[0.3, 0.2])
p=pa([0.9, 0.3],[0.3, 0.2])

function U(x,p::pa)
    x.+(p.a.*x).^2 .*p.b.*x
end

function ΔU(x,p)
    ForwardDiff.derivative.(x -> U(x,p), x)
end

x=[0.1,0.2]
# to get diff at one point for all:
ΔU(x[1],p)
# this gives an array of arrays
ΔU(x,p)

# this gives the output I want, but the work is done length(x)^2 times instead of length(x) times, since each call to ΔU returns the full array and I only need one entry of it
[ΔU(x[i],p)[i] for i in eachindex(x)]

What I am looking for is an efficient way to compute the last line above without the redundant calculations. I hope the issue is clear enough.


#2

In Julia, it’s very often helpful to be clear about what your scalar operation is, without trying to mix it in with the vectorized or broadcasted version of that operation. In particular, I would suggest:

  1. Redefine your U() to simply be a scalar function of scalars (x, a, b) or (x, pa) where pa contains only scalar a and b.
  2. To call U() on a vector of data, put the broadcast outside of U(), as in U.(x, p.a, p.b).
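  3. Now your derivative is just a broadcast of the scalar derivative: define dU(x, a, b) = ForwardDiff.derivative(z -> U(z, a, b), x) and call dU.(x, p.a, p.b). (ForwardDiff.gradient expects a scalar-valued function, so it cannot be applied directly to the vector-valued U.(x, p.a, p.b).)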
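
Put together, the steps above might look roughly like the sketch below. It assumes the same formula as the original U; the helper name dU is introduced here for illustration and is not from the original post. The per-element derivative uses ForwardDiff.derivative on the scalar function, and the broadcast sits outside it, so each x[i] is paired with its own a[i] and b[i].

```julia
using ForwardDiff

# Scalar kernel: same formula as the original U, but for a single x, a and b.
U(x, a, b) = x + (a * x)^2 * b * x

# Scalar derivative dU/dx at one point (dU is a hypothetical helper name).
dU(x, a, b) = ForwardDiff.derivative(z -> U(z, a, b), x)

x = [0.1, 0.2]
a = [0.9, 0.3]
b = [0.3, 0.2]

# Broadcasting pairs x[i] with a[i] and b[i]: one derivative per element,
# so length(x) evaluations rather than length(x)^2.
d_all = dU.(x, a, b)

# Scalars broadcast against arrays, so mixed inputs still work.
d_mixed = dU.(x, 0.3, b)
```

Since U here is x + a^2 * b * x^3, the result can be checked by hand against 1 + 3 * a^2 * b * x^2 for each element.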

Finally, I would suggest making the fields of the pa struct concretely typed, as this will improve performance of accessing those fields significantly.
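
One possible way to get concretely typed fields while still allowing either scalars or arrays is a parametric struct. This is only a sketch: the name Params and the type parameters A and B are made up for illustration.

```julia
# Parametric struct: the field types are fixed per instance,
# so the compiler knows them when accessing p.a and p.b.
struct Params{A,B}
    a::A
    b::B
end

p1 = Params(0.3, [0.3, 0.2])          # scalar a, vector b
p2 = Params([0.9, 0.3], [0.3, 0.2])   # both vectors

# Untyped fields default to Any; here each field type is concrete.
a_is_concrete = isconcretetype(fieldtype(typeof(p1), :a))
b_is_concrete = isconcretetype(fieldtype(typeof(p1), :b))
```

With the original definition, fieldtype(pa, :a) is Any, which is what makes field access slow. Dropping mutable is optional, but it fits here if the fields are never reassigned.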


#3

Many thanks, that makes a lot of sense.