Benchmarking all() vs for-loop

Is it possible to get the same performance from the all() function compared to the for-loop equivalent?

v = rand(1:2,1_000_000)
idx = findall(isequal(1),v)

function f1(v,idx)
    @inbounds for i in idx
        if v[i] ≠ 1
            return false
        end
    end

    return true
end

function f2(v,idx)
    return @inbounds all(isequal(1), view(v,idx))
end

Benchmarking for me shows f1() taking ~211μs and f2() ~304μs, both with no allocations. Is the all() function just doing extra checks? This is mostly a curiosity, but I’d rather have a one-liner as opposed to a whole other function.

I think you are paying the price of creating the view:

julia> function f1(v,idx)
           @inbounds for i in idx
               if v[i] ≠ 1
                   return false
               end
           end
           return true
       end
f1 (generic function with 1 method)

julia> f2(v,idx) = all(isequal(1), view(v,idx))
f2 (generic function with 1 method)

julia> function f3(v,idx)
           vv = @view(v[idx])
           for el in vv
               if el ≠ 1
                   return false
               end
           end
           return true
       end
f3 (generic function with 1 method)

julia> @btime f1($v, $idx)
  570.033 μs (0 allocations: 0 bytes)
true

julia> @btime f2($v, $idx)
  952.366 μs (0 allocations: 0 bytes)
true

julia> @btime f3($v, $idx)
  926.045 μs (0 allocations: 0 bytes)
true

which suggests this alternative:

julia> f4(v,idx) = all(@inbounds(v[i] == 1) for i in idx)
f4 (generic function with 1 method)

julia> f4(v,idx)
true

julia> @btime f4($v,$idx)
  567.193 μs (0 allocations: 0 bytes)
true
1 Like

Gotta love generators :smile:. I guess I was too focused on the all(predicate, A) syntax.

I also think of views as being virtually zero cost, but I guess that usually negligible amount can crop up in cases like this.

1 Like

I think that’s true for contiguous areas of memory, which is not the case with your idx collection of indices:

julia> @btime @view($v[$idx]);
  97.791 μs (0 allocations: 0 bytes)

julia> @btime @view($v[:]);
  1.458 ns (0 allocations: 0 bytes)

julia> @btime @view($v[begin:end÷2]);
  1.791 ns (0 allocations: 0 bytes)
2 Likes

Yeah, the fact that that works well even when there is a non-true early element seems almost magical.

For what is worth, this also works:

julia> f5(v, idx) = all(isequal(1), @inbounds(v[i]) for i in idx)
f5 (generic function with 1 method)

julia> @btime f5($v,$idx)
  573.287 μs (0 allocations: 0 bytes)
true