I’ve been playing with writing things in different forms, for example the following:
function f_by_loop!(data,x)
@inbounds for i in eachindex(data)
data[i] = min(x,data[i])
end
end
vs data .= min.(x,data) or map!(y -> min(y,x), data, data).
where, for example, data=randn(100_000_000) and x=1.0.
These do the same thing and once I added @inbounds to the loop, all three performed similarly.
The broadcast form is definitely the most concise and I’m guessing the least error prone.
My question is: Is broadcast also preferable to the manual loop because it does not unnecessarily specify an order of operation, making it easier for the compiler to optimize?
Support for efficient operations on more esoteric containers. For example, broadcasting over a sparse array can avoid unnecessarily traversing all of the zero entries:
julia> function f_loop!(y, x)
for i in eachindex(y)
@inbounds y[i] = x[i]
end
end
f_loop! (generic function with 1 method)
julia> x = sprand(100, 100, 0.001);
julia> @btime y .= $x setup=(y = similar(x));
37.082 ns (0 allocations: 0 bytes)
julia> @btime f_loop!(y, x) setup=(y = similar(x))
60.394 μs (0 allocations: 0 bytes)
There’s an efficient way to iterate over only the nonzeros in a sparse array, but broadcasting is smart enough to do that for you.
I’m not entirely sure what you mean by “preferable” (some options: more performant, considered better style, etc.), but with loops please do note that the @simd annotation can be used to indicate to the compiler that the loop iterations are independent and can be reordered.
Note also that, even if they overlap almost exactly for functions of a single argument, they can mean different things for functions of two or more arguments.
I figured out that the loop version was using SIMD instructions once I applied @inbounds. I believe @simd wasn’t necessary because there is no reduction aspect of the problem I chose as an example.