Create ForwardDiff.Dual is slow

Hi all,

I am using ForwardDiff to calculate a jacobian in a physical problem. Everything works and I am happy with the results. Now I would like to optimaze it but I am failing in doing so.

As a start, I have a very large wave function that I would like to derive once and then store the results. I perform the derivation as follows

jacobian(x->F(f,x...), p)

where F is my large function, f is a vector of frequencies and p are the parameters I am interested in deriving.

This function needs to be used by another function that projects it onto a detector (actually many) so I replaced the previous step with

jacobian(x->G_1(f,x...), p)
...
jacobian(x->G_n(f,x...), p)

where each G contains F.

There is an obvious bottleneck so i tried to compute just once the derivative of F and then assemble a jacobian using


F_value = F(f,p)
F_jacobian = jacobian(x->F(f,x...), p)

function G_1(f,p, F_value, F_jacobian)

    G = Array{ForwardDiff.Dual}(undef, length(f))
    for i in 1:length(f)
        @views G[i] = ForwardDiff.Dual(F_value[i], F_jacobian[i,:]...)
    end
    # I corrected the tag in the Dual but removed here to simplify the description

    # more stuff, function of F

end
       

This is just a sketch of what happens in the real code.
The main issue is that the step of filling the vector G is very slow.

In the following pic I wrote a function working with Floats and with ForwardDiff.Dual that reproduce the thing I would like to do. The function working with Duals is 60x times slower. Any advices?

Thanks

Hi @shady287, welcome to the community!
There are several performance pitfalls with your screenshotted code (please try to make it copy-pasteable when you can btw), most of which are related to type stability:

  • When you create Vector{ForwardDiff.Dual}, this is not a concrete element type, which means the compiler cannot optimize subsequent operations.
  • When you splat F_jacobian[i, .]..., this prevents the compiler from knowing statically (just based on types) how many arguments the function Dual is called with.

Can you be a little more precise on how your functions G_i relate to F, and what the typical size of the inputs is?

Thank you very much for clarifying those points!

To add a bit of context:

  • F is a wave function and depends on some parameters p that are usually \lesssim 7
  • G is a detector projection that projects F (the wave) in the detector frame and perform a set of operations on F that depend on other parameters q (which I did not mention before). The entries of q are \lesssim 7
  • The vector of frequencies is usually order 1000 entries

I am interested in the jacobian of G_i i.e.,

jacobian(x->G_1(f,x...), [p,q])

Thanks for the reference on Type declaration, I see the issue. How could I fix it with ForwardDiff.Dual ?

P.S. I paste here the code that I took the screen of

using ForwardDiff
using BenchmarkTools

function copy_float(a, b) 
    result = [zip(a[i], b[i,:]) for i in 1:length(a)];
end

function copy_ForwardDiff(a, jacobian)
    result = Array{ForwardDiff.Dual}(undef, length(a))
    for i in 1:length(a)
        @views result[i] = ForwardDiff.Dual(a[i], jacobian[i,:]...);
    end
end

@btime copy_float(rand(1000), rand(1000,3)); # works as expected

@btime copy_ForwardDiff(rand(1000), rand(1000,3)) # much slower

Thank you

I don’t exactly follow what you are doing, but this splat is likely to be slow. The type of every Dual is dynamic. Something like this should be type-stable (where 3 is known at compile-time):

result = map((ai, row) -> Dual(ai, NTuple{3}(row)...), a, eachrow(jacobian))

Same idea as the code suggested by @mcabbott:

function copy_ForwardDiff_fast(a, jacobian)
    return map(eachindex(a)) do i  # element type inferred automatically
        ForwardDiff.Dual(a[i], jacobian[i,1], jacobian[i, 2], jacobian[i, 3]);  # known number of arguments
    end
end
julia> @btime copy_float($a, $jac); # works as expected
  33.385 μs (2003 allocations: 93.81 KiB)

julia> @btime copy_ForwardDiff($a, $jac); # much slower
  2.159 ms (28003 allocations: 867.25 KiB)

julia> @btime copy_ForwardDiff_fast($a, $jac); # even faster
  12.304 μs (3 allocations: 31.31 KiB)

However, in your case it may not be necessary to manipulate Dual numbers manually. You have two functions F(p) and G_i(F, q). Why not compute the Jacobian J_F(p) once and for all, and then apply the chain rule manually for (p, q) -> G_i(F(p), q)?

1 Like

That’s great! Thanks very much to both of you.

To reply to @gdalle, I thought about using the chain rule. However the number of parameters changes, so I wanted something more automatic. Maybe I did not give enough thought to it. Thanks for the idea.