Help with more efficient function using ForwardDiff

Hi, I have a struct whose fields are either arrays or scalars, and I use it as input to a function that broadcasts over them. I then need the derivative of this function with respect to x, evaluated per “row” of the output: not the derivative of every row at a single x, but each row at its own value of x. Here is an MWE:

using ForwardDiff

mutable struct pa
    a
    b
end
# function can take mixed scalars or arrays (as long as all arrays are equal length)
p=pa(0.3,[0.3, 0.2])
p=pa([0.9, 0.3],[0.3, 0.2])

function U(x,p::pa)
    x.+(p.a.*x).^2 .*p.b.*x
end

function ΔU(x,p)
    ForwardDiff.derivative.(x -> U(x,p), x)
end

x=[0.1,0.2]
# to get diff at one point for all:
ΔU(x[1],p)
# this gives an array of arrays
ΔU(x,p)

# this gives the output I want but the calculations of ΔU are made length(x)^2 times instead of length(x) since each call to ΔU gives the full array but I only need one value
[ΔU(x[i],p)[i] for i in eachindex(x)]

What I am looking for is an efficient way to compute the last line above without all the unnecessary calculations. I hope the issue is clear enough.

In Julia, it’s very often helpful to be clear about what your scalar operation is, without trying to mix it in with the vectorized or broadcasted version of that operation. In particular, I would suggest:

  1. Redefine your U() to simply be a scalar function of scalars (x, a, b) or (x, pa) where pa contains only scalar a and b.
  2. To call U() on a vector of data, put the broadcast outside of U(), as in U.(x, p.a, p.b).
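  3. Now your per-row derivative is just a broadcast of ForwardDiff.derivative over the scalar function: ((xi, a, b) -> ForwardDiff.derivative(z -> U(z, a, b), xi)).(x, p.a, p.b). Note that ForwardDiff.gradient expects a scalar-valued function, so it cannot be applied directly to the vector-valued x -> U.(x, p.a, p.b).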
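
The steps above can be sketched roughly as follows. This is only an illustration, assuming the three-argument scalar form U(x, a, b) with the same formula as in the question; the helper name dU is made up here and not part of ForwardDiff.

```julia
using ForwardDiff

# Scalar kernel: same formula as the original U, but for scalar x, a, b.
U(x, a, b) = x + (a * x)^2 * b * x

# Hypothetical helper: derivative of U with respect to x at a single point.
dU(x, a, b) = ForwardDiff.derivative(z -> U(z, a, b), x)

x = [0.1, 0.2]
a = [0.9, 0.3]
b = [0.3, 0.2]

# One derivative per row, computed length(x) times in total.
d_rows = dU.(x, a, b)

# Broadcasting also accepts a scalar a (or b) mixed with arrays.
d_mixed = dU.(x, 0.3, b)
```

Because the broadcast sits outside the scalar function, each element of x is differentiated exactly once against its own a and b, which avoids the length(x)^2 evaluations of the comprehension in the question.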

Finally, I would suggest making the fields of the pa struct concretely typed, as this will improve performance of accessing those fields significantly.
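
One possible way to do this while still allowing either scalars or arrays in each field is a parametric struct. The name Pa and the choice to make it immutable are assumptions for this sketch, not something taken from the original code.

```julia
# Hypothetical parametric version of pa: the type of each field is fixed
# per instance, so the compiler knows it whether a and b are scalars or vectors.
struct Pa{A,B}
    a::A
    b::B
end

p1 = Pa(0.3, [0.3, 0.2])         # a is a Float64, b is a Vector{Float64}
p2 = Pa([0.9, 0.3], [0.3, 0.2])  # both fields are Vector{Float64}
```

With the untyped fields of the original struct, every field is stored as Any, so code that reads p.a or p.b cannot be specialized on the field type.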

Many thanks, that makes a lot of sense.