ReverseDiff function type specification

optimization

#1

I’m using ReverseDiff in my project, but am wondering how to speed up my implementation. Specifically, I cannot specify the types of the inputs to the function I want to take the gradient of, which slows down other parts of the code where this function gets called (and where I do not need a gradient evaluation).

As an example, say I have the function func:

function func( a::Vector, b::Vector )
       return sum(a .* b)
end

When I run ReverseDiff I get the following error:

ERROR: MethodError: no method matching func(::ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}, ::ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}})

Now consider the function funcFree:

function funcFree( a, b )
       return sum( a.*b )
end

With funcFree the gradient works, but now the evaluation time for funcFree is about 1000x higher than for func (as is kind of expected).

Any thoughts how to ensure that I can get the gradient, but the function evaluation is still fast?
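
For context, my gradient call looks roughly like this (just a sketch using the tuple form of ReverseDiff.gradient; the MethodError above is what I get with the typed func):

using ReverseDiff

a = randn(5); b = randn(5)

# the untyped version works and returns a tuple of gradients w.r.t. a and b
grad_a, grad_b = ReverseDiff.gradient(funcFree, (a, b))

# the typed version throws the MethodError shown above, because ReverseDiff
# evaluates it with TrackedArrays rather than Vectors
# ReverseDiff.gradient(func, (a, b))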


#2

The type annotations in the function do not affect performance.

What are you comparing here, given that your func does not even work?


#3

I’m comparing evaluation times of the function itself (without the gradient), which does seem to be impacted by the type annotations? See below.

using BenchmarkTools

a = randn(5); b = randn(5)
@benchmark funcFree(a,b)

which gives

BenchmarkTools.Trial: 
  memory estimate:  1.20 KiB
  allocs estimate:  27
  --------------
  minimum time:     8.312 μs (0.00% GC)
  median time:      8.718 μs (0.00% GC)
  mean time:        9.701 μs (3.01% GC)
  maximum time:     2.975 ms (98.30% GC)
  --------------
  samples:          10000
  evals/sample:     3

compared to

@benchmark func(a,b)
BenchmarkTools.Trial: 
  memory estimate:  144 bytes
  allocs estimate:  2
  --------------
  minimum time:     67.188 ns (0.00% GC)
  median time:      70.216 ns (0.00% GC)
  mean time:        107.427 ns (30.12% GC)
  maximum time:     6.770 μs (97.43% GC)
  --------------
  samples:          10000
  evals/sample:     978

This seems like a pretty big difference to me? Or what did you mean?


#4

If you don’t interpolate the globals, i.e. @benchmark funcFree($a,$b), you are also benchmarking a piece of the inference machinery (the dynamic dispatch on the untyped globals) along with your function. Check the BenchmarkTools docs.


#5

Besides the speed question, I think ReverseDiff needs to evaluate your function with its own array type, which is an AbstractVector but not a Vector. So if you do need to restrict your function (e.g. because a different method handles matrices), then annotate with ::AbstractVector.

(ForwardDiff instead uses a Vector of dual numbers.)
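
Something along these lines should do it (a rough sketch, with my own function name):

using ReverseDiff

# restrict to vectors, but accept any AbstractVector,
# including ReverseDiff.TrackedArray
function funcAbstract(a::AbstractVector, b::AbstractVector)
    return sum(a .* b)
end

a = randn(5); b = randn(5)
funcAbstract(a, b)                          # ordinary evaluation with plain Vectors still works
ReverseDiff.gradient(funcAbstract, (a, b))  # and the gradient dispatches fine too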


#6

You can also declare the globals constant:

julia> using BenchmarkTools

julia> function func( a::Vector, b::Vector )
              return sum(a .* b)
       end
func (generic function with 1 method)

julia> function funcFree( a, b )
              return sum( a.*b )
       end
funcFree (generic function with 1 method)

julia> const a = randn(5); const b = randn(5);

julia> @benchmark func(a, b)
BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     28.656 ns (0.00% GC)
  median time:      31.547 ns (0.00% GC)
  mean time:        39.803 ns (17.13% GC)
  maximum time:     35.199 μs (99.89% GC)
  --------------
  samples:          10000
  evals/sample:     995

julia> @benchmark funcFree(a, b)
BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     27.851 ns (0.00% GC)
  median time:      31.174 ns (0.00% GC)
  mean time:        39.840 ns (17.91% GC)
  maximum time:     36.366 μs (99.88% GC)
  --------------
  samples:          10000
  evals/sample:     995


julia> a2 = randn(5); b2 = randn(5);

julia> @benchmark func($a2, $b2)
BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     27.698 ns (0.00% GC)
  median time:      31.125 ns (0.00% GC)
  mean time:        39.762 ns (18.18% GC)
  maximum time:     36.008 μs (99.86% GC)
  --------------
  samples:          10000
  evals/sample:     994

julia> @benchmark funcFree($a2, $b2)
BenchmarkTools.Trial: 
  memory estimate:  128 bytes
  allocs estimate:  1
  --------------
  minimum time:     27.428 ns (0.00% GC)
  median time:      31.466 ns (0.00% GC)
  mean time:        39.951 ns (18.25% GC)
  maximum time:     35.437 μs (99.88% GC)
  --------------
  samples:          10000
  evals/sample:     995

Also, the fact that your func allocates memory adds noise to the benchmark. I know your func is just a proxy, but making things allocation-free is often a good way to reduce noise:

julia> @benchmark a' * b
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     8.695 ns (0.00% GC)
  median time:      8.876 ns (0.00% GC)
  mean time:        8.952 ns (0.00% GC)
  maximum time:     19.827 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999
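
Applied to your example, an allocation-free version could simply use dot (just a sketch; I haven’t checked how it interacts with ReverseDiff):

using LinearAlgebra, BenchmarkTools

# dot avoids allocating the temporary array created by a .* b
funcDot(a::AbstractVector, b::AbstractVector) = dot(a, b)

@benchmark funcDot($a2, $b2)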

#7

Thanks for all the replies, very helpful. Should’ve read the BenchmarkTools documentation better.