I think we should distinguish between the language and the ecosystem.
As far as the language is concerned, I don’t know of a case where a well-written for loop would be slower than broadcasting (with a caveat, see below).
Of course it’s possible to write loops that iterate on arrays using the “wrong” order for dimensions. Maybe broadcasting can help avoid this type of mistake. Example:
# Bad iteration: the inner loop runs over the second (slow) dimension
function f(A)
    B = similar(A)
    for i in axes(A, 1)
        for j in axes(A, 2)
            B[i, j] = abs(A[i, j])
        end
    end
    return B
end
# Good iteration: the inner loop runs over the first (fast) dimension
function g(A)
    B = similar(A)
    for j in axes(A, 2)
        for i in axes(A, 1)
            B[i, j] = abs(A[i, j])
        end
    end
    return B
end
# Broadcasting
function h(A)
    B = similar(A)
    B .= abs.(A)
    return B
end
julia> A = rand(3000, 3000);
julia> @btime f($A);
269.833 ms (2 allocations: 68.66 MiB)
julia> @btime g($A);
45.292 ms (2 allocations: 68.66 MiB)
julia> @btime h($A);
33.961 ms (2 allocations: 68.66 MiB)
The wrong order of iteration gives a slow loop. Broadcasting uses the right order so it’s fast.
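To make the "right order" concrete: Julia stores arrays in column-major order, so the first index varies fastest in memory. The small sketch below (the names `memory_order`, `good`, and `bad` are only for illustration) records the order in which each loop nesting visits the elements of a tiny matrix whose entries are numbered by memory position.

```julia
# A 2×3 matrix whose entries equal their position in memory (column-major).
A = reshape(1:6, 2, 3)

# Elements in the order they are laid out in memory.
memory_order = collect(vec(A))

# Visiting order of the "good" nesting: j outer, i inner.
good = [A[i, j] for j in axes(A, 2) for i in axes(A, 1)]

# Visiting order of the "bad" nesting: i outer, j inner.
bad = [A[i, j] for i in axes(A, 1) for j in axes(A, 2)]

println(memory_order)  # [1, 2, 3, 4, 5, 6]
println(good)          # [1, 2, 3, 4, 5, 6]  (contiguous access)
println(bad)           # [1, 3, 5, 2, 4, 6]  (strided access)
```

The "good" nesting walks memory contiguously, while the "bad" one jumps by a full column length at every step, which is much less cache-friendly on a 3000×3000 matrix.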
Wait, is broadcasting faster than the “good” loop? Well, these loops do almost no work: much of the time is spent indexing, so bounds checking is a significant overhead. This can be fixed:
function g2(A)
    B = similar(A)
    # @inbounds on the outer loop also covers the nested loop
    @inbounds for j in axes(A, 2)
        for i in axes(A, 1)
            B[i, j] = abs(A[i, j])
        end
    end
    return B
end
julia> @btime g2($A);
33.725 ms (2 allocations: 68.66 MiB)
It’s nice that broadcasting does this automatically, so you don’t have to worry about illegal uses of @inbounds. But maybe one day the compiler will make these particular @inbounds annotations unnecessary.
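As a side note, and not something measured above, another way to avoid writing @inbounds by hand is to loop with `eachindex`. The sketch below uses a hypothetical `g3`; it assumes that `A` and `B` share the same indices, which `eachindex(A, B)` checks once up front. The compiler can often, though not always, drop the per-element bounds checks in this form, so it is worth benchmarking before relying on it.

```julia
# Hypothetical variant of g: a single loop over all indices of A and B.
function g3(A)
    B = similar(A)
    # eachindex(A, B) verifies once that A and B have matching indices
    # and iterates in an order that is efficient for their layout.
    for I in eachindex(A, B)
        B[I] = abs(A[I])
    end
    return B
end

A = [-1.0 2.0; 3.0 -4.0]
println(g3(A) == abs.(A))  # true
```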
The ecosystem is a different story: there’s no guarantee that a particular package won’t perform worse with loops. At least Zygote is known to be slower with loops than with vectorized code; see e.g. Speed of vectorized vs for-loops using Zygote - #12 by Elrod. But it’s more the exception than the rule (at least it’s the only case I know of).