Anonymous function as argument can't be optimized

performance

#1

Hello, I have the following (simplified) code where I wonder why the second algorithm runs slower:

using BenchmarkTools

nn(x, bc) = x > 2 ? x : bc(x)

function algorithm_direct(Λ::AbstractArray{T, N}, bc::Function = x->x, bond_condition = (x, y) -> x == y) where {T, N}
	for i ∈ Λ
		bond_condition(i, nn(i, x->x))
	end
end

function algorithm(Λ::AbstractArray{T, N}, bc::Function = x->x, bond_condition = (x, y) -> x == y) where {T, N}
	for i ∈ Λ
		bond_condition(i, nn(i, bc))
	end
end

Λ = [1,2,3,4,5]

@btime algorithm_direct(Λ, x->x, (x, y) -> x == y )
@btime algorithm(Λ, x->x, (x, y) -> x == y )

This gives me:

  401.069 ns (0 allocations: 0 bytes)
  1.536 μs (0 allocations: 0 bytes)

So the question is: why is the version with the function passed as a parameter slower than the version with a fixed function in the call to nn?

This is a runnable minimal example; the actual code is more complex, but the result is the same.

In the original code, @code_warntype tells me that bc is not optimized out. I would like it to be optimized away, as in algorithm_direct.

Thanks…


#2

Don’t benchmark in global scope. Wrap the call in a function and interpolate the global variable with $:

julia> f1(l) = algorithm_direct(l, x->x, (x, y) -> x == y )
f1 (generic function with 1 method)

julia> @btime f1($Λ)
  4.164 ns (0 allocations: 0 bytes)

julia> f2(l) = algorithm(l, x->x, (x, y) -> x == y )
f2 (generic function with 1 method)

julia> @btime f2($Λ)
  3.863 ns (0 allocations: 0 bytes)
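
A likely reason for this advice: Λ is a non-constant global, so a bare @btime call at top level can also measure the cost of looking up a variable whose type the compiler cannot assume. A minimal sketch of the usual BenchmarkTools pattern follows; `total` and `data` are hypothetical stand-ins, not names from the thread.

```julia
using BenchmarkTools

# Hypothetical stand-in for the function under test.
total(xs) = sum(x -> x * x, xs)

data = [1, 2, 3, 4, 5]   # non-constant global, like Λ in the thread

# Without interpolation, `data` is referenced as an untyped global
# inside the benchmark expression, which can add overhead to the timing.
@btime total(data)

# With `$`, the value is spliced into the benchmark expression,
# so the measurement covers the call itself.
@btime total($data)
```

Both calls compute the same value; only what gets timed differs.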

#3

Hello, thanks for this quick reply.

However, I still get different results on Julia 0.6.2 with your calls:

  29.229 ns (0 allocations: 0 bytes)
  841.827 ns (0 allocations: 0 bytes)

#4

What happens if you move the creation of x->x out of the for loop in algorithm_direct?


#5

How can I do that when I need it for each call of nn?

This version behaves exactly the same way:

function algorithm_direct(Λ::AbstractArray{T, N}, bc::Function = x->x, bond_condition = (x, y) -> x == y) where {T, N}
	f = x->x
	for i ∈ Λ
		bond_condition(i, nn(i, f))
	end
end
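
That hoisting changes nothing is plausible, assuming the usual lowering of anonymous functions: an x->x that captures no variables is a zero-size singleton, so "creating" it on every iteration involves no allocation. A small sketch to check this, where `make_identity` is a hypothetical helper:

```julia
# Hypothetical helper: every call evaluates the same anonymous-function
# expression, which captures no variables.
make_identity() = x -> x

f = make_identity()
g = make_identity()

# Both calls yield the same singleton instance of one concrete type.
same_type     = typeof(f) === typeof(g)
same_instance = f === g
no_payload    = sizeof(f) == 0

# A separately written x -> x is a different expression and gets its own type.
h = x -> x
distinct_type = typeof(h) !== typeof(f)
```

All four flags should be true on current Julia versions; a closure that captures a variable would instead carry that variable as a field.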

#6

Sorry, my mistake: I overlooked the ns/μs units and thought algorithm_direct was slower.
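
On the original question, one documented cause of this pattern is that Julia, as a heuristic, may skip specializing a method on a Function argument that is only passed through to another call and never called in the method body, which is the role bc plays in algorithm. Giving the argument its own type parameter forces specialization. Whether this accounts for the Julia 0.6.2 timings in this thread is an assumption. The function names below are hypothetical, and the loops return a count so the result can be checked.

```julia
nn(x, bc) = x > 2 ? x : bc(x)

# Pass-through version: `bc` is only forwarded to `nn`, never called here,
# so the compiler may choose not to specialize on its concrete type.
function count_bonds(Λ, bc, bond_condition)
    n = 0
    for i in Λ
        n += bond_condition(i, nn(i, bc)) ? 1 : 0
    end
    return n
end

# Same body, but `bc::F where {F}` asks for a specialization
# per concrete function type.
function count_bonds_specialized(Λ, bc::F, bond_condition::G) where {F, G}
    n = 0
    for i in Λ
        n += bond_condition(i, nn(i, bc)) ? 1 : 0
    end
    return n
end

Λ = [1, 2, 3, 4, 5]
same = (x, y) -> x == y

r1 = count_bonds(Λ, x -> x, same)
r2 = count_bonds_specialized(Λ, x -> x, same)
r3 = count_bonds_specialized(Λ, x -> x + 1, same)
```

Timing both versions with @btime and interpolated arguments would show whether the annotation matters on a given Julia version.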