# For loop Performance

Can somebody help me speed up the following function?

```julia
using Statistics  # `mean` lives in the Statistics stdlib

function mean_loop(p::AbstractArray, q::AbstractArray)
    a = p
    a1 = q
    z = 0.0
    for i in 1:length(a1)
        s = 0.0
        @inbounds s = mean(a .> a1[i])
        z += s
        s = 0.0
    end
    return z
end
```

```julia
julia> @time mean_loop(rand(10^6), rand(10^6))
902.126519 seconds
```

This isn't horribly slow, but I think it can be much more compact:

```julia
julia> mean_cast(p, q) = sum(p .> q') / length(p);

julia> mean_iter(p, q) = sum(p1 > q1 for p1 in p, q1 in q) / length(p);

julia> using BenchmarkTools

julia> @btime mean_loop($(rand(10^4)), $(rand(10^4)));
  107.303 ms (30000 allocations: 54.17 MiB)

julia> @btime mean_cast($(rand(10^4)), $(rand(10^4)));
  128.281 ms (4 allocations: 11.93 MiB)

julia> @btime mean_iter($(rand(10^4)), $(rand(10^4)));
  76.649 ms (7 allocations: 160 bytes)
```

Try this:

```julia
using Statistics  # for `mean`

function mean_loop(p::AbstractArray, q::AbstractArray)
    z = 0.0
    for x in q
        z += mean(t -> t > x, p)
    end
    return z
end
```

Nicely done, that's about 7x faster than my `mean_iter`. Which is a little surprising to me, don't they iterate through the same things?

Welcome to the JuliaLang discourse @sam1988!

I'd recommend reading PSA: make it easier to help you for advice on how to format your code and ask questions in a way that makes it more likely for you to get high-quality responses that are also useful to others in the future (though it appears you've already got some good replies!).

Furthermore, when benchmarking your code, using `@time` can be very misleading. In the example you gave, `@time mean_loop(rand(10^6),rand(10^6))` will include the time it takes to compile the function `mean_loop` and the time it takes to create the arrays `rand(10^6)` and `rand(10^6)`, so most of the time you're measuring is not actually code relevant to `mean_loop`. Instead, I would use the `@btime` macro from BenchmarkTools.jl to properly measure performance, like this:

```julia
using BenchmarkTools
@btime mean_loop($(rand(10^6)), $(rand(10^6)))
```

The `@btime` macro will not include compile time in its benchmarks, and it runs the benchmarked function many times so it can give statistically significant results. When I write `$(rand(10^6))` inside `@btime`, it means that I am precomputing the array `rand(10^6)` and splicing the value in as a constant before the benchmark loop is run.
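To see what the interpolation changes, here is a minimal sketch (the variable `x` and the use of `sum` are just for illustration; the exact timings are machine-dependent, so none are shown):

```julia
using BenchmarkTools

x = rand(10^3)

# Without `$`, `x` is looked up as a non-constant global on every call,
# so the measurement includes dynamic-dispatch overhead.
@btime sum(x)

# With `$`, the array is spliced in as a constant, so only `sum` itself
# is being timed.
@btime sum($x)
```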


I guess there's some kind of overhead to the nested-loop generator. Splitting it into two generators seems faster:

```julia
mean_iter2(p, q) = sum(sum(p1 > q1 for p1 in p) for q1 in q) / length(p)
```

You can go a bit faster with integer arithmetic:

```julia
function mean_loop_int(p::AbstractArray, q::AbstractArray)
    z = 0
    for x in q
        z += count(>(x), p)
    end
    return z / length(p)
end
```
```julia
julia> x = rand(10^4);

julia> y = rand(10^4);

julia> @btime mean_loop($x, $y)
  19.034 ms (0 allocations: 0 bytes)
4968.276699999993

julia> @btime mean_loop_int($x, $y)
  15.797 ms (0 allocations: 0 bytes)
4968.2767
```

But if you change the algorithm, you can get significant speedups:

```julia
function mean_sorted(p, q)
    ps = sort(p)
    Lp = length(p)  # count elements of `p` above each `x`, so use length(p)
    z = 0
    for x in q
        ind = searchsortedfirst(ps, x)
        z += Lp - ind + 1
    end
    return z / Lp
end
```
```julia
julia> @btime mean_sorted($x, $y)
  1.512 ms (2 allocations: 78.20 KiB)
4968.2767
```

Possibly, it can be even faster if we sort both arrays and search for the next larger element at each iteration, but I couldn't immediately find a `searchsortednext` function.
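In lieu of a `searchsortednext`, the same idea can be sketched as a two-pointer sweep over both sorted arrays (`mean_sorted2` is a hypothetical name, not from this thread); each lookup then costs amortized O(1) instead of a binary search:

```julia
# Sketch of a two-pointer variant: sort both arrays once, then sweep.
# Because qs is ascending, the first index i with ps[i] > x only ever
# moves forward, so the sweep is O(N) after the two O(N log N) sorts.
function mean_sorted2(p, q)
    ps = sort(p)
    qs = sort(q)
    Lp = length(ps)
    z = 0
    i = 1  # first index with ps[i] > current x
    for x in qs
        while i <= Lp && ps[i] <= x
            i += 1
        end
        z += Lp - i + 1  # elements of p strictly greater than x
    end
    return z / Lp
end
```

The result does not depend on the iteration order over `q`, so sorting `q` changes nothing but the access pattern.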

This also scales well for longer vectors. For 10^6 length I get:

```julia
julia> @btime mean_sorted(x, y) setup=(x=rand(10^6); y=rand(10^6))
  395.748 ms (2 allocations: 7.63 MiB)
500344.792228
```

That is, `O(N log N)` complexity rather than `O(N^2)`.


How about using `view`?

Yeah, I guess.