Making iterators as fast as loops

Let me reiterate – the biggest problem here is that in the loop version, the compiler is able to simplify the inner cubic complexity loop to a constant formula. Similar to k += (i^3 - i^2 + 1) * (i^3 + i^2 + 2i) ÷ 2. This does not happen in the iterator version. So the loop version does O(n) operations, while the iterator version does O(n^4) operations. The test is flawed.

For a more accurate test, we can for example do array indexing in the inner loop. (Don’t worry about the memory access, once the memory is cached, it has very low overhead.)

function test_loop(n, v)
    k = zero(eltype(v))
    for i = 1:n, j = i^2:i^3
        k += @inbounds v[i + j]
    end
    return k
end

function test_iter(n, v)
    k = zero(eltype(v))
    for (i, j) = MyIter(1:n)
        k += @inbounds v[i + j]
    end
    return k
end

Comparing performance:

julia> n = 10;

julia> v = rand(n^3 + 10n);

julia> @btime test_loop($n, $v)
  3.032 μs (0 allocations: 0 bytes)
1314.111504915699

julia> @btime test_iter($n, $v)
  3.953 μs (0 allocations: 0 bytes)
1314.111504915699

So the difference is not very big. As has been pointed out a few times above, a secondary issue is that the endstate comparison is inefficient. Let’s fix that:

-    s == endstate && return nothing
+    s.i == endstate.i && return nothing

Timing again:

julia> @btime test_iter($n, $v)
  3.134 μs (0 allocations: 0 bytes)
1314.111504915699

This is around 3 % slower than the loop version. If the iterator is a better fit for your problem than a loop, I think this should be good enough. To sanity check, note that the inner loop is hit 2650 times for n = 10, which corresponds to 3134 / 2650 ≈ 1.2 ns per inner loop, or 3.4 clock cycles on my 2.9 GHz system. The overhead of the iterator version over the loop version is 0.11 clock cycles per iteration.

(A word of caution: The above benchmarks assume that the runtime is dominated by the inner iteration (i^2 : i^3). If this is not the case, the iterator version will be a bit slower.)

10 Likes