Why do these two similar functions lead to different timings?

DrPapa · November 16, 2018, 10:04pm

I am in the process of porting over some Matlab code that implements the Vector Field Navigation approached outline in Goncalves et al. 2010 leveraging AD. The code is working properly, but I am confused why two seemingly equivalent implementations results in significantly different timings and # of allocations.

VF_grad(), takes function gradients as its arguments, while VF_func() takes the original function and then determines the gradients.

using ForwardDiff
using StaticArrays
using LinearAlgebra
using BenchmarkTools

α1(x::AbstractArray) = (x[1]/13.0).^2.0 .+ (x[2]/7.0).^2.0 .- 1.0
α2(x::AbstractArray) = x[3]
V(x::AbstractArray)  = -sqrt(α1(x)^2.0 + α2(x)^2.0);

∇α1(x::AbstractArray) = ForwardDiff.gradient(α1,x)
∇α2(x::AbstractArray) = ForwardDiff.gradient(α2,x)
∇V(x::AbstractArray)  = ForwardDiff.gradient(V,x);

function VF_grad(x::AbstractArray, ∇α1::Function, ∇α2::Function, ∇V::Function, G::Real=1.0, H::Real=1.0)
    ∇α1x = ∇α1(x)
    ∇α2x = ∇α2(x)
    circ = ∇α1x × ∇α2x
    converge = ∇V(x)

    G*converge + H*circ
end;

function VF_func(x::AbstractArray, α1::Function, α2::Function, V::Function, G::Real=1.0, H::Real=1.0)
    ∇α1x = ForwardDiff.gradient(α1,x)
    ∇α2x = ForwardDiff.gradient(α2,x)
    circ = ∇α1x × ∇α2x
    converge = ForwardDiff.gradient(V,x)

    G*converge + H*circ
end;

## Make Data
eval_array = [SVector(rand(3)...) for i ∈ 1:750];

@btime VF_func.($eval_array, α1, α2, V)
@btime VF_grad.($eval_array, ∇α1, ∇α2, ∇V);

259.922 μs (9004 allocations: 275.58 KiB)
 53.775 μs (4 allocations: 17.77 KiB)

Can somebody explain this behavior to me? Both look to be type-stable.

For some reason this behavior seems connected to ×. Changing circ = ∇α1x × ∇α2x to circ = ∇α1x leads to

51.266 μs (4 allocations: 17.77 KiB)
51.253 μs (4 allocations: 17.77 KiB)

Also, if I use the original definition of circ but modify my return to G*converge + circ, i.e. no H* I get

51.489 μs (4 allocations: 17.77 KiB)
51.497 μs (4 allocations: 17.77 KiB)

jw3126 · November 16, 2018, 10:18pm

I think julia is not fully specializing VR_func. You could try to force to specialize on the function arguments like this:

function VF_func2(x::AbstractArray, α1::F1, α2::F2, V::F3, G::Real=1.0, H::Real=1.0) where {F1,F2,F3}
           ∇α1x = ForwardDiff.gradient(α1,x)
           ∇α2x = ForwardDiff.gradient(α2,x)
           circ = ∇α1x × ∇α2x
           converge = ForwardDiff.gradient(V,x)

           G*converge + H*circ
       end;

Note that usually julia does a good job when deciding what to specialize and you usually don’t need to force it.

kristoffer.carlsson · November 16, 2018, 10:32pm

I think the heuristic is related to whether the function argument is directly called or not in the function body but I don’t know the details.

stillyslalom · November 16, 2018, 10:34pm

Another approach: return a callable function (since you presumably won’t be changing α1, α2, and V for each x), then call it:
***edit: sped up ~50% by @inline annotations

function VF(α1, α2, V, G=1.0, H=1.0)
    @inline ∇α1(x) = ForwardDiff.gradient(α1,x)
    @inline ∇α2(x) = ForwardDiff.gradient(α2,x)
    @inline converge(x) = ForwardDiff.gradient(V,x)

    x -> G*converge(x) + H*(∇α1(x) × ∇α2(x))
end

@btime VF(α1, α2, V).($eval_array)    #  12.693 μs (7 allocations: 17.84 KiB)

DrPapa · November 19, 2018, 1:59pm

@jw3126 This works, but I am having a hard time understanding why. Are α1, α2, and V different subtypes of Function? Running typeof(α1) seems to indicate that, I get typeof(α1) returned.

Because my function requires α1, α2, and V to be callable, would I be best off using something like

where {F1<:Function,F2<:Function,F3<:Function}

Or is there a more generic way to limit to types that are “callable”. E.g. These functions should be able to handle custom types like ODESolution form DifferentialEquations.jl since they are callable, but I expect that they are not subtypes of Functions.

@kristoffer.carlsson I think you are probably correct. When I add any combination of

a = α1(x)
b = α2(x)
c = V(x)

to my function, I get reduced timings and # of allocations even though I am increasing the number of operations.

kristoffer.carlsson · November 19, 2018, 2:03pm

Yes.

That limits it to functions but prevents general callable objects. I would argue it is best to just remove the <:Function and do duck typing instead.

DrPapa · November 19, 2018, 2:04pm

I’m not sure I know what you mean. Can you elaborate? Does this mean adding logic in my functions based based on α1 being callable? Something like

isempty(methods(α1))

DrPapa · November 19, 2018, 2:11pm

This works for me too and makes sense. However, I see no speed-up using @inline vs. not. I am using Julia v0.7. I thought it was maybe due to different optimization flags but I tried using -O3, but saw no change.

kristoffer.carlsson · November 19, 2018, 2:20pm

Just assume that the thing passed in is callable and if it isn’t, there will be a runtime error.

DrPapa · November 19, 2018, 2:39pm

That’s simple enough! Thanks.

Topic		Replies	Views
Vectorization of ForwardDiff.gradient function New to Julia	7	1839	October 29, 2017
Element-wise additions behaviour Performance	9	400	January 30, 2020
Innocent looking optimization of the forward pass causes performance cliff in gradient calculation with Zygote.jl v0.4.7 Machine Learning performance	5	939	February 5, 2020
Why is this Julia code considerably slower than Matlab New to Julia performance	64	8386	March 5, 2017
Gradient of a normal (NOT vector) function that has vector arguments General Usage	10	1117	July 30, 2019

Why do these two similar functions lead to different timings?

Related topics