Hello,
while trying to optimize some Julia code I wrote, I encountered several performance issues all following the same kind of pattern, related to using higher-order functions and generator expressions nested within an outer loop.
Below is a somewhat stupid example, but I think it illustrates my point well enough is a self-contained way. These are three implementations of the same algorithm, in which two loops are involved. All timings below were obtained using BenchmarkTools
and julia -O3
.
First variant: both loops are written with for
loops.
function v1(n :: Int)
prime = Array{Bool}(undef, n)
fill!(prime, true)
for i in 2:n
for j in 2:i-1
if i%j==0
prime[i] = false
break
end
end
end
end
julia> @btime v1(1000)
307.409 ÎĽs (1 allocation: 1.06 KiB)
Second variant: the inner loop has been replaced by a call to a higher-order function (any
in this case) and a generator expression. Notice how the run time and number of allocations increase.
function v2(n :: Int)
prime = Array{Bool}(undef, n)
fill!(prime, true)
for i in 2:n
if any(i%j==0 for j in 2:i-1)
prime[i] = false
end
end
end
julia> @btime v2(1000)
4.536 ms (22201 allocations: 379.16 KiB)
I can imagine why this would be the case, since maybe temporary objects are needed to instantiate the generators, and the optimizer does not see how to avoid them. What puzzles me most is this third variant, in which both loops are using higher-order functions. Replacing the outer loop by a call to foreach
gives the performance back!
function v3(n :: Int)
prime = Array{Bool}(undef, n)
fill!(prime, true)
foreach(2:n) do i
if any(i%j==0 for j in 2:i-1)
prime[i] = false
end
end
end
julia> @btime v3(1000)
313.988 ÎĽs (1 allocation: 1.06 KiB)
Does anybody know what happens in this case? Or am I doing anything wrong? Would there be a way to improve version 2 in such a way that it can use higher-order functions to replace the inner loop, but still not kill the performance when used in a “regular” (i.e. for
or while
) outer loop?
Thanks in advance