How to speed up array comparisons?

Noel_Araujo · May 19, 2018, 8:55pm

I need to fix a bottleneck in my code when comparing an array with a variable.
I’ve tested the map() function, and it speeds up my code only when I write the number directly as a literal. If I store the number in a variable, I lose the performance gain.
Below is a simple example in Julia 0.6.2. A is just an array, and I need to compare it with a value stored in cte.

A = collect(1:1000) + rand(1000);
cte  = 560.5
@btime A .> 560.5
@btime A .> cte
@btime map( (l)-> l > 560.5,A)
@btime map( (l)-> l > cte,A)

On my computer, the results were:

4.679 μs (21 allocations: 5.06 KiB)
3.916 μs (21 allocations: 4.95 KiB)
846.938 ns (2 allocations: 1.08 KiB)
16.998 μs (1003 allocations: 16.72 KiB)

We can see that the third line has the best result. How can I define my variable cte so that the fourth line has the same performance?

Thank you

nalimilan · May 19, 2018, 9:00pm

You can do const cte = 560.5, or wrap the call in a function which takes cte as an argument.
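Both suggestions can be sketched as follows. This is a minimal illustration, not a benchmark: the names `CTE` and `above` are hypothetical and chosen only for this example.

```julia
# Option 1: a constant global. Its type can no longer change,
# so code that refers to it can be specialized by the compiler.
const CTE = 560.5

# Option 2: pass the threshold as an argument. Inside the function,
# `cte` is a local with a known type for each compiled method instance.
# `above` is a hypothetical helper name.
above(A, cte) = map(l -> l > cte, A)

A = collect(1:1000) .+ rand(1000)
cte = 560.5                            # ordinary (non-constant) global

mask_const = map(l -> l > CTE, A)      # uses the constant global
mask_func  = above(A, cte)             # passes the global as an argument
```

Both calls should produce the same Boolean mask as `A .> cte`; only the way the threshold reaches the comparison differs.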

Noel_Araujo · May 19, 2018, 9:07pm

It works, thank you

Seif_Shebl · May 19, 2018, 10:00pm

Or, if you need something super fast:

julia> function f(x,A)
       C = Array{Bool}(undef,length(A))
       @inbounds for i in 1:length(A)
         C[i] = ifelse(A[i]>x, true, false)
       end
       C
       end
f (generic function with 1 method)

julia> A = (1:1000) + rand(1000);
julia> cte = 560.5;
julia> @btime f($cte,$A);
  256.591 ns (1 allocation: 1.06 KiB)

DNF · May 19, 2018, 10:26pm

When benchmarking with BenchmarkTools, remember to always interpolate the variables!

julia> @btime map( (l)-> l > cte,A);
  25.785 μs (1003 allocations: 16.72 KiB)

julia> @btime map( (l)-> l > $cte,$A);
  737.640 ns (2 allocations: 1.09 KiB)

If you have weird benchmarking results, 98% of the time, it’s because you didn’t interpolate.

stevengj · May 19, 2018, 10:36pm

If array .> x is the bottleneck in your code, it’s likely you are doing something wrong. Don’t write Matlab/Numpy-style code that performs a sequence of vector operations one by one. Write a single loop that does all your processing in one pass over the array. (And put performance-critical code like this in a function.)
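As an illustration of the single-pass idea, assume the hypothetical task is to sum the elements of A that exceed x (the original post does not say what the mask is used for). The function names below are made up for this sketch.

```julia
# Vectorized style: builds a temporary mask, then a temporary
# filtered copy of the array, and only then sums it.
sum_above_vectorized(A, x) = sum(A[A .> x])

# Single pass: one loop over the data, no temporary arrays.
function sum_above_loop(A, x)
    s = zero(eltype(A))
    for a in A
        if a > x
            s += a
        end
    end
    return s
end

A = collect(1:1000) .+ rand(1000)
x = 560.5

total_vectorized = sum_above_vectorized(A, x)
total_loop = sum_above_loop(A, x)
```

The two totals should agree up to floating-point rounding, since the summation order may differ.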

DNF · May 19, 2018, 10:38pm

I have seen pretty much this exact answer to basically the same question countless times. Why the reluctance to recommend variable interpolation? To me, that seems so much more convenient than creating a function. What am I missing?

stevengj · May 19, 2018, 10:42pm

Interpolation is only something that works with the BenchmarkTools macros. And writing a lot of performance critical code as one big global script is bad programming style anyway—it leads to code reuse by copy-paste and editing your code every time you need to change a parameter.

RoyiAvital · May 19, 2018, 11:42pm

Could you explain what variable interpolation is and when it makes a difference in performance?

ChrisRackauckas · May 20, 2018, 12:26am

Interpolating a variable splices its value into the expression as a literal, instead of referring to a non-constant global, which would break type inference.

foobar_lv2 · May 20, 2018, 12:29am

It makes no difference in practice; it is a quirk of the BenchmarkTools macros.

@btime only really works in the global scope. But when you measure a function call via @btime, you measure both the dispatch and the time spent in the function, because @btime has to take your variables from the global scope (where they are untyped). This is distinct from using literals for benchmarking. Generally, whenever your timings are in microseconds (or even nanoseconds), I would be wary of artifacts; such functions can really only be properly benchmarked in context (inlining and whatever your CPU ends up doing).

# on 0.6.2
julia> using BenchmarkTools

julia> @btime +(1,2);
  2.198 ns (0 allocations: 0 bytes)

julia> x=1;y=2;
julia> @btime +(x,y);
  23.521 ns (0 allocations: 0 bytes)
julia> @btime +($x,$y)
  2.633 ns (0 allocations: 0 bytes)

and

# on 0.7
julia> using BenchmarkTools

julia> @btime +(1,2)
  0.024 ns (0 allocations: 0 bytes)

julia> x=1;y=2;
julia> @btime +(x,y)
  29.138 ns (0 allocations: 0 bytes)
julia> @btime +($x,$y)
  2.207 ns (0 allocations: 0 bytes)

RoyiAvital · May 20, 2018, 5:44am

@foobar_lv2, I can see the difference in results, but what’s Variable Interpolation?
What’s being done exactly?

TsurHerman · May 20, 2018, 7:11am

If I understand correctly, when you run
@benchmark func(X)
and func and/or X are not constants in the scope you are benchmarking (that is, if it is possible to change the type of func or X without getting a warning),

then what you are actually benchmarking in each iteration is:
tic
get the current values for func and X
apply func(X)
toc

Whereas when you add the $ sign in the benchmark macro, you are not using the variable; you are using the value that the variable had at the time of the macro invocation, which is probably what you wanted.
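One way to picture the difference is with ordinary quoted expressions, where `$` has a similar "use the value now" meaning. This is only an analogy and an assumption about how to think of it: the BenchmarkTools macros may handle `$` differently internally.

```julia
x = 1

# Refers to the global name `x`; its value is looked up when evaluated.
ex_global = :(x + 1)

# The current value of `x` is spliced in, giving the expression :(1 + 1).
ex_interp = :($x + 1)

x = 100   # rebind the global afterwards

r_global = eval(ex_global)   # looks up `x` now
r_interp = eval(ex_interp)   # uses the value captured earlier
```

After rebinding `x`, the first expression sees the new value while the interpolated one still carries the old value.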
