When should I write loops or vectorised calls?

Hi,

Coming from numpy and R, I am accustomed to the fact that vectorised expressions are faster than loops. I understand this is not the case in Julia, where vectorised expressions are usually expanded out to loops by the compiler anyway.

But is that always true? Are there cases where broadcasted dot calls will be faster than the corresponding loop? I assume there are, but are there heuristics somewhere for knowing when that will be the case?

1 Like

Most of the time there is no difference. I think the main exception is when dot-broadcasting hooks into some special machinery, say for GPU-based arrays.
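
For example, with CUDA.jl (a hedged sketch assuming a CUDA-capable GPU; the sizes are arbitrary), a dot call on a CuArray fuses into a single GPU kernel, while an element-by-element loop would fall back to slow scalar indexing:

using CUDA  # assumes CUDA.jl and a working GPU

x = CUDA.rand(10_000)
y = similar(x)

y .= sqrt.(x) .+ 1       # the whole dot expression fuses into one GPU kernel

CUDA.allowscalar(false)  # make accidental element-wise indexing an error
# A scalar loop like `for i in eachindex(x); y[i] = sqrt(x[i]) + 1; end`
# would now throw, since each y[i] access is a separate slow GPU transfer.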

So, for normal day-to-day programming you should just choose whichever reads better. If you need the last bit of performance, you should of course always profile (use BenchmarkTools.jl).

7 Likes

The most common mistake is the other way around, in which one unintentionally allocates something and the broadcast operation becomes slower. Dots and broadcasts must be written with care because of that.

julia> x = rand(1000);

julia> using BenchmarkTools

julia> @btime map(sqrt,x[1:500]);
  1.503 μs (3 allocations: 8.14 KiB)

julia> @btime map(sqrt,@view $x[1:500]);
  987.300 ns (1 allocation: 4.06 KiB)

julia> @btime sqrt.($x[1:500]);
  1.420 μs (2 allocations: 8.13 KiB)

julia> @btime sqrt.(@view $x[1:500]);
  988.600 ns (1 allocation: 4.06 KiB)

6 Likes

Expanding on the previous example, vectorized assignment operations can be slightly slower:

using BenchmarkTools

x = rand(100000)

f(x) = x[1:50000] .= sqrt.(@view x[1:50000])

function g(x)
    for i in 1:50000
        x[i] = sqrt(x[i])
    end
end

julia> @btime f(y) setup=(y=copy(x)) evals=1;
  79.260 μs (1 allocation: 48 bytes)

julia> @btime g(y) setup=(y=copy(x)) evals=1
  79.197 μs (0 allocations: 0 bytes)

But the difference is so small that it rarely matters.

1 Like

Yet knowing the loop versions and what can be done with them is a must:

julia> @btime g(y) setup=(y=copy(x)) evals=1
  79.283 μs (0 allocations: 0 bytes)

julia> using LoopVectorization

julia> function g(x)
           @avx for i in 1:50000
               x[i] = sqrt(x[i])
           end
       end
g (generic function with 1 method)

julia> @btime g(y) setup=(y=copy(x)) evals=1
  40.705 μs (0 allocations: 0 bytes)

5 Likes

I think we should distinguish the language and the ecosystem.

As far as the language is concerned, I don’t know of a case where a well-written for loop would be slower than broadcasting (with a caveat, see below).

Of course it’s possible to write loops that iterate on arrays using the “wrong” order for dimensions. Maybe broadcasting can help avoid this type of mistake. Example:

# Bad iteration
function f(A)
    B = similar(A)
    for i in axes(A, 1)
        for j in axes(A, 2)
            B[i,j] = abs(A[i,j])
        end
    end
    return B
end

# Good iteration
function g(A)
    B = similar(A)
    for j in axes(A, 2)
        for i in axes(A, 1)
            B[i,j] = abs(A[i,j])
        end
    end
    return B
end

# Broadcasting
function h(A)
    B = similar(A)
    B .= abs.(A)
    return B
end

julia> A = rand(3000, 3000);

julia> @btime f($A);
  269.833 ms (2 allocations: 68.66 MiB)

julia> @btime g($A);
  45.292 ms (2 allocations: 68.66 MiB)

julia> @btime h($A);
  33.961 ms (2 allocations: 68.66 MiB)

The wrong order of iteration gives a slow loop. Broadcasting uses the right order so it’s fast.

Wait, is broadcasting faster than the “good” loop? Well, these loops do almost no work; much of the time is spent on indexing, so bounds checking is a significant overhead. This can be fixed:

function g2(A)
    B = similar(A)
    @inbounds for j in axes(A, 2)
        for i in axes(A, 1)
            B[i,j] = abs(A[i,j])
        end
    end
    return B
end

julia> @btime g2($A);
  33.725 ms (2 allocations: 68.66 MiB)

It’s a nice thing that broadcasting does automatically, and you don’t have to worry about illegal uses of @inbounds. But maybe one day the compiler will make these particular @inbounds unnecessary.
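
As an aside, a pattern that often avoids the question entirely (a sketch of the same operation, assuming the usual bounds-check elision): iterating eachindex(B, A) asserts that both arrays share that index range, which typically lets the compiler drop the bounds checks without an explicit @inbounds.

function g3(A)
    B = similar(A)
    for I in eachindex(B, A)  # linear indices, column-major order
        B[I] = abs(A[I])      # bounds checks can typically be elided here
    end
    return B
end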

The ecosystem is a different story: there’s no guarantee that a particular package won’t have worse performance when using loops. At least Zygote is known to be slower with loops than with vectorized code, see e.g. Speed of vectorized vs for-loops using Zygote - #12 by Elrod. But it’s more the exception than the rule (at least it’s the only case I know of).
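
To make the Zygote case concrete, a sketch (the function names are made up for illustration, and the speed gap is the one reported in the linked thread, not measured here):

using Zygote

sum_sin_vec(x) = sum(sin.(x))      # broadcast: one vectorized adjoint

function sum_sin_loop(x)           # loop: differentiated step by step
    s = zero(eltype(x))
    for v in x
        s += sin(v)
    end
    return s
end

gradient(sum_sin_vec, rand(100))   # fast reverse pass
gradient(sum_sin_loop, rand(100))  # same gradient, much slower with Zygote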

8 Likes

I guess the most general answer is, then: if you are comfortable with the vectorized version and can write it without mistakenly allocating intermediate arrays, then vectorized operations have the best cost/benefit ratio. If one of these operations is the critical part of the code, it might be a good idea to expand it into loops and take advantage of special structure in the specific problem (this might include parallelizing the loop, for example).

At the same time, sometimes one or the other option reads better, and that is one reason to choose one or the other in Julia, because you can.

One type of problem where loops are preferable is when you can ‘bail out’ early: you check a condition and break if appropriate. That is frequently a source of significant speedups, as in the sketch below.
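
A minimal sketch (the names anyabove_loop / anyabove_bcast are invented for illustration):

function anyabove_loop(x, t)
    for v in x
        v > t && return true    # bail out at the first hit
    end
    return false
end

anyabove_bcast(x, t) = any(x .> t)  # evaluates the comparison for every
                                    # element and allocates a temporary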

6 Likes

Apart from @sijo’s answer, you took the time to show/argue that broadcasted calls are not so bad, although this is not what the OP (me) asked for: I wanted cases where the broadcasted calls will be faster than loops, not the other way around.

Apart from what @sijo said about “writing a good loop is sometimes harder than writing a good broadcast”, are there known cases where this happens, or is it structurally not a thing in Julia?

It is not a thing. A properly written loop will be as fast as a broadcasted call, or faster (if some structure of the problem is used).
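
As an illustration of “using some structure” (a hypothetical example; the names are invented): a loop can visit only the upper triangle of a symmetric computation, which a plain broadcast over all pairs cannot express.

# Sum of |x[i] - x[j]| over all ordered pairs i ≠ j.
function pairsum_loop(x)
    s = zero(eltype(x))
    for j in eachindex(x), i in firstindex(x):j-1
        s += abs(x[i] - x[j])   # visit each unordered pair once
    end
    return 2s                   # symmetry supplies the other half
end

pairsum_bcast(x) = sum(abs.(x .- x'))  # allocates and computes all N^2 entries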

3 Likes

Note that the optimal loop order can change based on the types of the arrays being accessed.

The LoopVectorization.@avx macro will try to pick the optimal loop order, taking into account Adjoint/Transpose/PermutedDimsArray wrappers.
This lets you write a single version of a loop, without needing to rewrite it to optimize for those cases.

@avx also works with broadcasts, but this has two limitations that loops don’t have:

  1. It will assume contiguous dimensions are not being broadcast. If they are being broadcast, you’ll probably get a segfault. This means you can’t write code like

     x = rand(1,10)
     A = rand(100,10);
     @avx @. exp(x) + A

     but using x = rand(10)' instead would be fine (see the sketch just after this list).
  2. The fact that other dimensions can be broadcast makes it currently use slightly less efficient indexing. Basically, you can’t increment the pointers and then use the pointer to check whether to terminate the loop. This is only relevant when you have multiple nested loops; with a single loop it makes more sense to use a loop induction variable than to increment the base pointer.
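
A small sketch of the workaround from item 1 (same arrays as in the snippet above): making the broadcast dimension an explicit adjoint keeps @avx safe.

using LoopVectorization

x = rand(10)'           # 1×10 Adjoint: the broadcast dimension is explicit
A = rand(100, 10)
B = @avx @. exp(x) + A  # safe, unlike the rand(1, 10) version above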

When not using @avx, you won’t get automatic loop-order switching for transposed arrays.
But when broadcasting, LLVM’s solution for avoiding problems 1) and 2) above is to create multiple loop versions, optimized for different broadcasting special cases. However, if you use too many arrays in your broadcast, it’ll give up on multiversioning and you’ll end up not getting SIMD code at all.

3 Likes

Does that suggest that it is a good idea to use @avx even if the SIMD operations are not advantageous, just in case a different data structure is given?

Note that it does add extra compiler latency, so it’d be your call as the author on whether the likelihood of alternate supported types is worth the trade-off.

But it should do better even in relatively simple cases:

using LoopVectorization, BenchmarkTools, Random
function mydotsimd(A, B)
    s = zero(promote_type(eltype(A), eltype(B)))
    @inbounds @simd for i in eachindex(A,B)
        s += A[i] * B[i]
    end
    s
end
function mydotavx(A, B)
    s = zero(promote_type(eltype(A), eltype(B)))
    @avx for i in eachindex(A,B)
        s += A[i] * B[i]
    end
    s
end

# benchmark random vector lengths from 1:512
N = 512;
x = rand(N); y = rand(N);
Ns = shuffle(1:N);
function testsizes(f::F, x::AbstractVector, y::AbstractVector, Ns) where {F}
    foreach(n -> @views(f(x[1:n], y[1:n])), Ns)
end
@btime testsizes(mydotsimd, $x, $y, $Ns)
@btime testsizes(mydotavx, $x, $y, $Ns)

For random-sized vectors, this yields:

julia> @btime testsizes(mydotsimd, $x, $y, $Ns)
  10.696 μs (0 allocations: 0 bytes)

julia> @btime testsizes(mydotavx, $x, $y, $Ns)
  8.156 μs (0 allocations: 0 bytes)

Now, getting a bit more complicated:

M = 32;
A = rand(M, M);
B = rand(M, M);
Ms = shuffle(1:M);

function testsizes(f::F, x::AbstractMatrix, y::AbstractMatrix, Ns) where {F}
    foreach(n -> @views(f(x[1:n,1:n], y[1:n,1:n])), Ns)
end
@btime testsizes(mydotsimd, $A, $B, $Ms)
@btime testsizes(mydotavx, $A, $B, $Ms)

@btime testsizes(mydotsimd, $A', $B, $Ms)
@btime testsizes(mydotavx, $A', $B, $Ms)

@btime testsizes(mydotsimd, $A, $B', $Ms)
@btime testsizes(mydotavx, $A, $B', $Ms)

@btime testsizes(mydotsimd, $A', $B', $Ms)
@btime testsizes(mydotavx, $A', $B', $Ms)

And now we get:

julia> @btime testsizes(mydotsimd, $A, $B, $Ms)
  9.208 μs (0 allocations: 0 bytes)

julia> @btime testsizes(mydotavx, $A, $B, $Ms)
  1.056 μs (0 allocations: 0 bytes)

julia> @btime testsizes(mydotsimd, $A', $B, $Ms)
  9.630 μs (0 allocations: 0 bytes)

julia> @btime testsizes(mydotavx, $A', $B, $Ms)
  2.344 μs (0 allocations: 0 bytes)

julia> @btime testsizes(mydotsimd, $A, $B', $Ms)
  9.575 μs (0 allocations: 0 bytes)

julia> @btime testsizes(mydotavx, $A, $B', $Ms)
  2.329 μs (0 allocations: 0 bytes)

julia> @btime testsizes(mydotsimd, $A', $B', $Ms)
  9.628 μs (0 allocations: 0 bytes)

julia> @btime testsizes(mydotavx, $A', $B', $Ms)
  1.035 μs (0 allocations: 0 bytes)

It’s also helpful to look specifically at the 32×32 case, which is ideal for LLVM:

julia> @btime mydotsimd($A, $B)
  33.207 ns (0 allocations: 0 bytes)
255.8607178195125

julia> @btime mydotavx($A, $B)
  34.313 ns (0 allocations: 0 bytes)
255.8607178195125

julia> @btime mydotsimd($A', $B)
  852.364 ns (0 allocations: 0 bytes)
257.12741470722705

julia> @btime mydotavx($A', $B)
  157.423 ns (0 allocations: 0 bytes)
257.12741470722733

julia> @btime mydotsimd($A, $B')
  847.691 ns (0 allocations: 0 bytes)
257.12741470722705

julia> @btime mydotavx($A, $B')
  148.625 ns (0 allocations: 0 bytes)
257.12741470722733

julia> @btime mydotsimd($A', $B')
  853.938 ns (0 allocations: 0 bytes)
255.86071781951276

julia> @btime mydotavx($A', $B')
  51.592 ns (0 allocations: 0 bytes)
255.86071781951247

mydotavx(A', B') should be as fast as mydotavx(A, B), but that doesn’t seem to be the case.
I’m currently working on rewriting the library, and will try to address this then.
In the meantime, I’m adding 1.6 support to the old version of the library, and will hopefully have that done reasonably soon. The rewrite will probably take a few months.

3 Likes

Thanks again, very much, for the class. I just found a couple of videos where you talk about these things, I have fun for the weekend guaranteed.

1 Like

Ha, thanks. I should really make a better video than the one from JuliaCon.
I’m also always happy to answer performance questions.

One of LLVM’s problems in vectorizing code is that it is overly optimistic.
For the (A, B') case, for example, it also generates SIMD code. It just only does so for the most optimistic possible scenario: what if B is actually a 1 × N matrix (where N is a multiple of 4× the SIMD vector width), so that we can just do an ordinary dot product?

julia> A = rand(128,1);

julia> B = rand(1,128);

julia> @btime mydotsimd($A, $B')
  10.060 ns (0 allocations: 0 bytes)
33.02101475712813

julia> @btime mydotavx($A, $B')
  162.519 ns (0 allocations: 0 bytes)
33.02101475712813

The only optimized code path LLVM generates here assumes that B is a 1 × N matrix. If it is 2 × N instead, we basically get scalar code:

julia> A = rand(128, 2);

julia> B = rand(2, 128);

julia> @btime mydotsimd($A, $B')
  191.710 ns (0 allocations: 0 bytes)
64.02246340273062

julia> @btime mydotavx($A, $B')
  163.333 ns (0 allocations: 0 bytes)
64.02246340273062

LoopVectorization assumes B has enough rows to SIMD, so we see no performance drop when bumping the number of rows to 8:

julia> A = rand(128, 8);

julia> B = rand(8, 128);

julia> @btime mydotsimd($A, $B')
  849.567 ns (0 allocations: 0 bytes)
268.0950263729372

julia> @btime mydotavx($A, $B')
  150.715 ns (0 allocations: 0 bytes)
268.095026372937

So, LoopVectorization.jl and LLVM are both optimizing for different scenarios here. I just think LLVM’s are less representative of typical inputs.

3 Likes

Some good points here. One thing to note: whether writing vectorized or loopy code, Julia should theoretically meet or beat vectorized code in interpreted languages (numpy, R, APL), since the compiler provides interfunction optimizations (e.g. loop fusion).
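
As a concrete example of loop fusion (a minimal sketch): the chained dot call below compiles to a single loop, whereas the equivalent numpy expression would materialize each intermediate array.

x = rand(10_000)
y = similar(x)

# The entire right-hand side fuses into one loop over x; no temporary
# arrays for sin.(x) or x.^2 are ever allocated.
y .= sin.(x) .+ 2 .* x .^ 2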

1 Like

I think that, in cases where the interpreted language delegates the vectorized operation to a C library, Julia can end up a little slower. But generally speaking, yes.

1 Like

Libraries like numpy are often quite well optimized (as good as it can get with its Python API), e.g. with regard to SIMD, but in principle you can do the same in Julia. LoopVectorization.jl may help here to get the last bit of performance.