Closure which discards the return is slower?

using FunctionWrappers

struct CallbackF64
    f::FunctionWrappers.FunctionWrapper{Void,Tuple{Array{Float64}}}
end
(cb::CallbackF64)(v) = cb.f(v)

f!(u) = (u.=u.^2) # returns the mutated array, not nothing
f = CallbackF64(f!)
u = Float64[1,2,3]
f(u) # Errors: the wrapper expects a Void return

g!(u) = (u.=u.^2; nothing)
g = CallbackF64(g!)
g(u)
g!(u)

h! = (u) -> (f!(u);nothing)
h = CallbackF64(h!)
h(u)
h!(u)

using BenchmarkTools
@benchmark f!($u)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.949 ns (0.00% GC)
  median time:      12.083 ns (0.00% GC)
  mean time:        14.726 ns (0.00% GC)
  maximum time:     541.074 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

@benchmark g!($u)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.572 ns (0.00% GC)
  median time:      11.328 ns (0.00% GC)
  mean time:        13.529 ns (0.00% GC)
  maximum time:     722.690 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

@benchmark h!($u)

BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     26.053 ns (0.00% GC)
  median time:      28.318 ns (0.00% GC)
  mean time:        37.032 ns (0.00% GC)
  maximum time:     1.934 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

Why is h so much slower?

Not sure if I’m misunderstanding you, but it doesn’t seem like discarding the return is the reason for the performance hit; rather, it’s the difference between:
f(u) = u and f = (u) -> u
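
A quick way to see that these two forms create different kinds of global bindings (a sketch; isconst(Main, :name) is available on recent Julia versions):

```julia
# Sketch: a named definition creates a constant global binding,
# while assigning an anonymous function creates an ordinary global.
f(u) = u        # const binding; its concrete type is known at call sites
g = (u) -> u    # non-const global; could be rebound to anything

# isconst reports whether a global binding in a module is constant.
println(isconst(Main, :f))  # true
println(isconst(Main, :g))  # false
```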

Another example:

julia> f!(u) = (u.=u.^2)
f! (generic function with 1 method)

julia> g!(u) = (f!(u))
g! (generic function with 1 method)

julia> h! = (u) -> (f!(u))
(::#3) (generic function with 1 method)

julia> u = Float64[1,2,3]
3-element Array{Float64,1}:
 1.0
 2.0
 3.0

julia> using BenchmarkTools

julia> @benchmark f!($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.922 ns (0.00% GC)
  median time:      11.223 ns (0.00% GC)
  mean time:        11.282 ns (0.00% GC)
  maximum time:     76.766 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

julia> @benchmark g!($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.923 ns (0.00% GC)
  median time:      11.219 ns (0.00% GC)
  mean time:        11.348 ns (0.00% GC)
  maximum time:     61.750 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

julia> @benchmark h!($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     29.161 ns (0.00% GC)
  median time:      29.177 ns (0.00% GC)
  mean time:        30.473 ns (0.00% GC)
  maximum time:     110.651 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     995

Also, I’m not sure why FunctionWrappers is necessary in your example, since you call the in-place versions anyway.

Huh, I’m surprised that doesn’t inline the function call and end up costing nothing…?

I want to do this as an option to stop specialization on functions in DiffEq, but I cannot guarantee that a user’s mutating function will actually return nothing, even though it doesn’t matter what they return, and the FunctionWrapper does care about that.
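
For that use case, one option might be to wrap the user’s function in a closure that discards whatever it returns, so the callback always returns nothing by the time it reaches the FunctionWrapper. A minimal sketch (make_void is a hypothetical helper name, not part of the code above):

```julia
# Hypothetical helper: force any callback to return nothing,
# regardless of what the user's mutating function returns.
make_void(f) = u -> (f(u); nothing)

sq!(u) = (u .= u.^2)    # user function that returns the array
sqv! = make_void(sq!)   # wrapped version always returns nothing

u = Float64[1, 2, 3]
sqv!(u)                 # mutates u in place, returns nothing
```

As the later replies here note, such a closure should be constructed inside a function (or bound to a const) rather than left as a non-const global, or calls to it will pay dynamic dispatch.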

Hm okay, so does this solve the issue for now? (a combination of your g! and h!)

julia> k!(u) = (f!(u);nothing)
k! (generic function with 1 method)

julia> @benchmark k!($u)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     10.915 ns (0.00% GC)
  median time:      11.216 ns (0.00% GC)
  mean time:        11.518 ns (0.00% GC)
  maximum time:     50.934 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

Hmm, interesting. I’ll need to see if that holds true in the scope of another function.

@benchmark $h!($u)


In your example, f! is a normal function, but h! is a global variable that could contain anything. You’re at least paying the cost of dynamic dispatch.
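
If the anonymous function really has to live at global scope, one way around the dynamic dispatch is to make the binding const, so the compiler knows its concrete type (a sketch under that assumption):

```julia
f!(u) = (u .= u.^2)

h1! = u -> (f!(u); nothing)         # non-const global: dynamic dispatch at call sites
const h2! = u -> (f!(u); nothing)   # const global: concrete type known, calls can inline

u = Float64[1, 2, 3]
h2!(u)   # mutates u in place, returns nothing
```

Interpolating the global into the benchmark expression, as in @benchmark $h!($u) above, has the same effect for benchmarking purposes.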


Yeah, that’s it. Anonymous functions aren’t global constants like named functions, so this was just a benchmarking mistake. Thanks!
