Closure which discards the return is slower?


#1
using FunctionWrappers

struct CallbackF64
    f::FunctionWrappers.FunctionWrapper{Void,Tuple{Array{Float64}}}
end
(cb::CallbackF64)(v) = cb.f(v)

f!(u) = (u.=u.^2)
f = CallbackF64(f!)
u = Float64[1,2,3]
f(u) # Errors

g!(u) = (u.=u.^2; nothing)
g = CallbackF64(g!)
g(u)
g!(u)

h! = (u) -> (f!(u);nothing)
h = CallbackF64(h!)
h(u)
h!(u)

using BenchmarkTools
@benchmark f!($u)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.949 ns (0.00% GC)
  median time:      12.083 ns (0.00% GC)
  mean time:        14.726 ns (0.00% GC)
  maximum time:     541.074 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

@benchmark g!($u)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.572 ns (0.00% GC)
  median time:      11.328 ns (0.00% GC)
  mean time:        13.529 ns (0.00% GC)
  maximum time:     722.690 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

@benchmark h!($u)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     26.053 ns (0.00% GC)
  median time:      28.318 ns (0.00% GC)
  mean time:        37.032 ns (0.00% GC)
  maximum time:     1.934 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

Why is h so much slower?


#2

Not sure if I’m misunderstanding you, but it doesn’t seem like discarding the return is the reason for the performance hit; rather, it’s the difference between:
f(u) = u and f = (u) -> u

Another example:

julia> f!(u) = (u.=u.^2)
f! (generic function with 1 method)

julia> g!(u) = (f!(u))
g! (generic function with 1 method)

julia> h! = (u) -> (f!(u))
(::#3) (generic function with 1 method)

julia> u = Float64[1,2,3]
3-element Array{Float64,1}:
 1.0
 2.0
 3.0

julia> using BenchmarkTools

julia> @benchmark f!($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.922 ns (0.00% GC)
  median time:      11.223 ns (0.00% GC)
  mean time:        11.282 ns (0.00% GC)
  maximum time:     76.766 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

julia> @benchmark g!($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.923 ns (0.00% GC)
  median time:      11.219 ns (0.00% GC)
  mean time:        11.348 ns (0.00% GC)
  maximum time:     61.750 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

julia> @benchmark h!($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     29.161 ns (0.00% GC)
  median time:      29.177 ns (0.00% GC)
  mean time:        30.473 ns (0.00% GC)
  maximum time:     110.651 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     995

Also, I’m not sure why FunctionWrappers is necessary in your example, since you call the in-place versions anyway.


#3

Huh, I’m surprised that doesn’t inline the function call and end up costing nothing…?

I want to do this as an option to stop specialization on functions in DiffEq, but I can’t guarantee that a user’s mutating function will actually return nothing. Even if it doesn’t matter what they return, the FunctionWrapper does care about the return type.
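For context, the kind of adapter I mean could be sketched like this (the helper name `discard_return` is mine, not from DiffEq): wrap whatever function the user passes in a closure that throws away its return value, so the wrapped callable always returns nothing no matter what the user's function returns.

```julia
# Hypothetical adapter (name is mine, not from DiffEq): wrap a user function
# in a closure that discards its return value, so the wrapped callable always
# returns nothing regardless of what the user's function returns.
discard_return(f) = u -> (f(u); nothing)

f!(u) = (u .= u .^ 2)   # returns the mutated array, not nothing

g! = discard_return(f!)
u = Float64[1, 2, 3]
g!(u)                   # mutates u in place, returns nothing
```

The resulting closure could then be handed to the FunctionWrapper, since its return value is always nothing.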


#4

Hm okay, so does this solve the issue for now? (a combination of your g! and h!)

julia> k!(u) = (f!(u);nothing)
k! (generic function with 1 method)

julia> @benchmark k!($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.915 ns (0.00% GC)
  median time:      11.216 ns (0.00% GC)
  mean time:        11.518 ns (0.00% GC)
  maximum time:     50.934 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

#5

Hmm, interesting. I’ll need to see if that holds true in the scope of another function.


#6

@benchmark $h!($u)


#7

In your example, f! is a normal function, but h! is a global variable that could contain anything. You’re at least paying the cost of dynamic dispatch.
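A sketch of the two usual fixes (the name `hc!` is mine): either interpolate the global into the benchmark with `$h!`, or bind the anonymous function to a `const` so the compiler knows its concrete type.

```julia
f!(u) = (u .= u .^ 2)

h! = u -> (f!(u); nothing)        # non-const global: its type could change,
                                  # so each call pays dynamic dispatch

const hc! = u -> (f!(u); nothing) # const binding: the concrete type is known,
                                  # so calls compile to a direct invocation

# With BenchmarkTools, interpolating the global also sidesteps the lookup:
#   @benchmark $h!($u)
```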


#8

Yeah, that’s it. Anonymous functions aren’t global constants like named functions, so this was just a benchmarking mistake. Thanks!