Benchmarking all() vs for-loop

RobertGregg · July 19, 2022, 6:10pm

Is it possible to get the same performance from the all() function compared to the for-loop equivalent?

v = rand(1:2,1_000_000)
idx = findall(isequal(1),v)

function f1(v,idx)
    @inbounds for i in idx
        if v[i] ≠ 1
            return false
        end
    end

    return true
end

function f2(v,idx)
    return @inbounds all(isequal(1), view(v,idx))
end

Benchmarking for me shows f1() taking ~211μs and f2() ~304μs, both with no allocations. Is the all() function just doing extra checks? This is mostly a curiosity, but I’d rather have a one-liner as opposed to a whole other function.

lmiq · July 19, 2022, 11:04pm

I think you are paying the price of creating the view:

julia> function f1(v,idx)
           @inbounds for i in idx
               if v[i] ≠ 1
                   return false
               end
           end
           return true
       end
f1 (generic function with 1 method)

julia> f2(v,idx) = all(isequal(1), view(v,idx))
f2 (generic function with 1 method)

julia> function f3(v,idx)
           vv = @view(v[idx])
           for el in vv
               if el ≠ 1
                   return false
               end
           end
           return true
       end
f3 (generic function with 1 method)

julia> @btime f1($v, $idx)
  570.033 μs (0 allocations: 0 bytes)
true

julia> @btime f2($v, $idx)
  952.366 μs (0 allocations: 0 bytes)
true

julia> @btime f3($v, $idx)
  926.045 μs (0 allocations: 0 bytes)
true

which suggests this alternative:

julia> f4(v,idx) = all(@inbounds(v[i] == 1) for i in idx)
f4 (generic function with 1 method)

julia> f4(v,idx)
true

julia> @btime f4($v,$idx)
  567.193 μs (0 allocations: 0 bytes)
true

RobertGregg · July 20, 2022, 12:51am

Gotta love generators . I guess I was too focused on the all(predicate, A) syntax.

I also think of views as being virtually zero cost, but I guess that usually negligible amount can crop up in cases like this.

giordano · July 20, 2022, 12:53am

I think that’s true for contiguous areas of memory, which is not the case with your idx collection of indices:

julia> @btime @view($v[$idx]);
  97.791 μs (0 allocations: 0 bytes)

julia> @btime @view($v[:]);
  1.458 ns (0 allocations: 0 bytes)

julia> @btime @view($v[begin:end÷2]);
  1.791 ns (0 allocations: 0 bytes)

lmiq · July 20, 2022, 1:02am

Yeah, the fact that that works well even when there is a non-true early element seems almost magical.

For what is worth, this also works:

julia> f5(v, idx) = all(isequal(1), @inbounds(v[i]) for i in idx)
f5 (generic function with 1 method)

julia> @btime f5($v,$idx)
  573.287 μs (0 allocations: 0 bytes)
true

Topic		Replies	Views
Benchmarking questions General Usage benchmark	3	318	September 18, 2023
A safe inbounds use with great performance effect Performance	1	434	May 19, 2022
Findall slow General Usage	8	1887	October 24, 2019
Comparison performance with `any` Performance	5	617	January 11, 2019
Comparing performance of 2 simple averaging functions - why is one faster? Performance	5	481	August 31, 2020

Benchmarking all() vs for-loop

Related topics