Loop faster than `reduce`?

stla · October 24, 2020, 11:16am

Hello,

Is there a way to do better than such a loop:

for t in 2:n
     jarray = jack(t, jarray)
end

I tried jarray = reduce((a, t) -> jack(t, a), 2:n; init = jarray), but this is slower than the loop.

Sukera · October 24, 2020, 12:10pm

Please provide a MWE that includes all necessery code to run it. For example, what’s jack? What’s jarray?

stla · October 24, 2020, 1:04pm

Hello,

The jack function is complex and not short. I’m afraid nobody will try to help me if I post it “as is”. I will try with a simple one to see if the behavior is the same. See you later.

Henrique_Becker · October 24, 2020, 1:09pm

I think it is a little strange that reduce is slower than the loop if jack is a complex function. If most of the processing was being done by jack calls any differences between reduce and the loop would be minimal.

ffevotte · October 24, 2020, 4:32pm

Let me first say that your operation looks to me more like a foldr than a plain reduce (unless there is some commutativity going on…)

As others said, both the for loop and the higher-order function should achieve the same kind of performance in such cases. See for example the following functions:

foo(x, y) = x=>y

function f1(n)
    res = 1
    for t in 2:n
        res = foo(t, res)
    end
    res
end

f2(n) = foldr(foo, n:-1:2, init=1)

which yield the same results:

julia> f1(5)
5 => (4 => (3 => (2 => 1)))

julia> f2(5)
5 => (4 => (3 => (2 => 1)))

and benchmark very similarly:

julia> using BenchmarkTools

julia> @benchmark f1(100)
BenchmarkTools.Trial: 
  memory estimate:  41.70 KiB
  allocs estimate:  99
  --------------
  minimum time:     15.080 μs (0.00% GC)
  median time:      16.603 μs (0.00% GC)
  mean time:        21.520 μs (5.91% GC)
  maximum time:     1.549 ms (97.01% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark f2(100)
BenchmarkTools.Trial: 
  memory estimate:  41.70 KiB
  allocs estimate:  99
  --------------
  minimum time:     16.240 μs (0.00% GC)
  median time:      17.434 μs (0.00% GC)
  mean time:        21.865 μs (6.05% GC)
  maximum time:     1.780 ms (96.66% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark f1(1000)
BenchmarkTools.Trial: 
  memory estimate:  3.90 MiB
  allocs estimate:  1488
  --------------
  minimum time:     1.195 ms (0.00% GC)
  median time:      1.297 ms (0.00% GC)
  mean time:        1.529 ms (12.72% GC)
  maximum time:     5.582 ms (57.30% GC)
  --------------
  samples:          3255
  evals/sample:     1

julia> @benchmark f2(1000)
BenchmarkTools.Trial: 
  memory estimate:  3.90 MiB
  allocs estimate:  1488
  --------------
  minimum time:     1.188 ms (0.00% GC)
  median time:      1.295 ms (0.00% GC)
  mean time:        1.554 ms (13.43% GC)
  maximum time:     5.287 ms (43.73% GC)
  --------------
  samples:          3206
  evals/sample:     1

(the same is true for even less costly operations, such as a plain floating-point sum, and should apply even more for more costly functions…)

DNF · October 24, 2020, 5:07pm

I guess this is related to the hypergeometric function you mentioned in a previous post?

You could provide a link to the repository, for those who are interested. But the best thing you could do is to try to compress the problem into a self-contained piece of code that captures the essential part of that function.

It also looks like you are passing in an array, and getting an updated version of that array out. That looks like it might have some performance issues. For example, are you updating the array in-place, or creating a new one for each iteration? How big is the array?

stla · October 31, 2020, 7:35pm

Sorry there was only 10% difference actually. This could be normal fluctuations.

DNF · October 31, 2020, 9:12pm

I’m pretty sure you can get a significant speedup if you are willing to share a bit more about what you’re doing. I have a hunch, from looking at

for t in 2:n
     jarray = jack(t, jarray)
end

that it’s really inefficient.

Topic		Replies	Views
Sum isn't faster than a loop? New to Julia	11	1816	June 24, 2020
For loop Performance Performance question	8	1636	January 11, 2020
Reduce vs. foldl: performance and precision Performance question , unrolling	11	5011	July 17, 2020
Why is copying using a loop is much slower than `copy` for large arrays? Performance question , copy	10	1398	November 27, 2022
Can this loop be optimized further? Performance question	7	738	December 20, 2019

Loop faster than `reduce`?

Related topics