Comparison performance with `any`


#1

Hey!

How can I prevent this loop from allocating memory on every iteration? If I interchange @view b[i-1:i] with just b the loop does not allocate but that’s not what I want :wink:.

I don’t really understand this behaviour.

using BenchmarkTools

function foo(a,b)
    t = 0.99
    for i = 2:length(b)
        if any(x -> x > t, @view b[i-1:i])
        end
    end
end

a = rand(10)
b = rand( 9)

@btime foo($a, $b)

gives

410.315 ns (8 allocations: 384 bytes)

#2

Currently, creating an array view often allocates a small object (https://github.com/JuliaLang/julia/issues/14955). This is not an issue for operating on large chunks of an array, but it is significant for creating many tiny views. Eventually, there will probably be a compiler optimization that eliminates this allocation.

Of course, in your case, you don’t need @view or any at all. If the code you posted is representative of what you actually want to do, then just test if b[i-1] > t || b[i] > t.


#3

In order to eliminate the need for a view, you could also use a generator expression such as this:

any(b[j] > t for j in i-1:i)

or this (which I personally find slightly less easy to read but is closer to your original example):

any(x->x>t, b[j] for j in i-1:i)

Here is your full example rewritten in this way:

using BenchmarkTools

function foo(a,b)
    t = 0.99
    for i = 2:length(b)
        if any(b[j] > t for j in i-1:i)
        end
    end
end
julia> a = rand(10);
julia> b = rand( 9);
julia> @btime foo($a, $b)
  28.349 ns (0 allocations: 0 bytes)

By comparison, your original example benchmarked like this on my machine:

julia> @btime foo($a, $b)
  95.541 ns (8 allocations: 384 bytes)

#4

You could also use uview from https://github.com/oschulz/UnsafeArrays.jl to get to zero allocation. It benchmarks about 2x faster on my machine (and a little slower than @ffevotte’s solution). By the way, I’m a little worried about this benchmark setup; ideally the compiler would just optimize this whole loop away.


#5

Alternatively, if the length is always known at code-writing-time (eg, 2):

julia> function foo2(a,b)
           t = 0.99
           @inbounds for i = 2:length(b)
               if any(x -> x > t, (b[i-1], b[i]))
               end
           end
       end
foo2 (generic function with 1 method)

julia> @btime foo2($a, $b)
  10.270 ns (0 allocations: 0 bytes)

Unfortunately, to make it work at compile time, you’d need generated functions because of inference failure on ntuple:

julia> function foo4(a,b)
           t = 0.99
           @inbounds for i = 2:length(b)
               
               if any(x -> x > t, ntuple(j -> b[i+j-2], Val(2)))
               end
           end
       end
foo4 (generic function with 1 method)

julia> @btime foo4($a, $b)
  3.181 μs (56 allocations: 1.25 KiB)

But if we want to take that approach, the function is simple enough for the compiler to cheat:

julia> @generated function foo5(a,b,::Val{N} = Val(2)) where N  
           quote
               t = 0.99
               @inbounds for i = 2:length(b)
                   if any((Base.Cartesian.@ntuple $N j -> b[i+j-2] > t))
                   end
               end
           end
       end
foo4 (generic function with 2 methods)

julia> @btime foo5($a, $b)
  1.202 ns (0 allocations: 0 bytes)

julia> @btime foo4($a, $b)
  1.482 ns (0 allocations: 0 bytes)

julia> a = rand(10^6);

julia> b = rand( 10^6 - 1);

julia> @btime foo5($a, $b)
  1.482 ns (0 allocations: 0 bytes)

because now the compiler realizes the code isn’t actually doing anything.


#6

Thanks to everyone for the great advice! It really helped a lot :slight_smile: