It appears to me that for can do the same, is no more verbose, and has simpler semantics. What’s the real use of the foreach ... do idiom then?
EDIT: Sorry, I should have mentioned that I had read the docs for for and foreach before posting.
In case iteration is over several containers:
julia> shorter = 1:2; longer = 'a':'c';
julia> foreach(shorter, longer) do x, y
           println(x, " and ", y)
       end
1 and a
2 and b
julia> for (x, y) in zip(shorter, longer)
           println(x, " and ", y)
       end
1 and a
2 and b
julia> for i in 1:length(shorter)
           println(shorter[i], " and ", longer[i])
       end
1 and a
2 and b
I think your examples show what the difference is. foreach is a version of map that does not return a result (where you are mapping for the side effects). Just type ?foreach in the REPL for an explanation.
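To make the map/foreach distinction concrete, here is a minimal sketch. The names squares, seen and result are just for illustration:

```julia
# map collects the return values; foreach discards them and returns nothing.
squares = map(x -> x^2, 1:3)                 # a Vector{Int}

seen = Int[]
result = foreach(x -> push!(seen, x), 1:3)   # called only for the side effect

# With an existing function there is no loop body to write:
foreach(println, ("a", "b"))
```

Here result is nothing, while seen has been filled as a side effect.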
foreach can be more convenient if you already have a function that you want to apply to each element of an iterator. Internally it is simply implemented as a loop. A major disadvantage is that in a scenario like yours, it allocates and is therefore significantly slower than a for loop. You have to jump through some hoops to fix this.
Here is an example showing that the foreach variant is much slower than a for loop unless one wraps the summation variable in a Ref.
function f1(itr) # for loop
    t = 0
    for x in itr
        t += x
    end
    t
end
function f2(itr) # foreach
    t = 0
    foreach(itr) do x
        t += x
    end
    t
end
function f3(itr) # foreach with Ref
    t = Ref(0)
    foreach(itr) do x
        t[] += x
    end
    t[]
end
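If you want to see the effect yourself, something along these lines should work. This is a rough sketch, not a proper benchmark. The names sum_boxed, sum_ref and sum_fold are made up for this post, and sum_fold is an extra variant of mine that avoids captured state altogether. As far as I understand, the allocation comes from the closure reassigning the captured variable, which makes Julia box it. The exact byte counts depend on the Julia version, so I am not quoting any.

```julia
# Hypothetical names; sum_boxed and sum_ref follow the same pattern as f2 and f3.
function sum_boxed(itr)
    t = 0
    foreach(itr) do x
        t += x          # reassigns a captured variable, so `t` gets boxed
    end
    return t
end

function sum_ref(itr)
    t = Ref(0)
    foreach(itr) do x
        t[] += x        # mutates the Ref; the binding `t` is never reassigned
    end
    return t[]
end

# Assumption: a fold is an acceptable replacement when all you need is the total.
sum_fold(itr) = foldl(+, itr; init = 0)

data = 1:1000
sum_boxed(data); sum_ref(data); sum_fold(data)   # warm up to exclude compilation

println("boxed: ", @allocated(sum_boxed(data)), " bytes")
println("Ref:   ", @allocated(sum_ref(data)), " bytes")
println("foldl: ", @allocated(sum_fold(data)), " bytes")
```

For serious timing, BenchmarkTools is the better tool; @allocated is only a quick check.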
One of the key differences I’m aware of is that foreach will unroll the loop when given a Tuple, which can help e.g. with type instabilities due to heterogeneous elements:
using BenchmarkTools
function square_all_for!(v::Tuple)
    for x in v
        x .= x .^ 2  # square in place, matching the foreach version below
    end
end
function square_all_foreach!(v::Tuple)
    foreach(v) do x
        x .= x .^ 2
    end
end
function mean_tuple()
    return (
        ones(Float16, 2),
        ones(Float32, 2),
        ones(Float64, 2),
        ones(ComplexF16, 2),
        ones(ComplexF32, 2),
        ones(ComplexF64, 2)
    )
end
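A quick way to compare the two on a heterogeneous tuple without BenchmarkTools is sketched below. The helpers square_for!, square_foreach! and make_tuple are hypothetical stand-ins for the functions above, using a smaller tuple filled with 3s so that the squaring is visible. Whether the for version actually allocates depends on the tuple length and the Julia version, so treat the printed numbers as something to check on your own machine.

```julia
# Hypothetical helper names; the tuple is a smaller version of mean_tuple().
function square_for!(v::Tuple)
    for x in v
        x .= x .^ 2
    end
    return nothing
end

function square_foreach!(v::Tuple)
    foreach(v) do x
        x .= x .^ 2
    end
    return nothing
end

make_tuple() = (fill(Float32(3), 2), fill(3.0, 2), fill(3.0 + 0im, 2))

a = make_tuple(); b = make_tuple()
square_for!(a); square_foreach!(b)      # first call also compiles both methods

println("for:     ", @allocated(square_for!(a)), " bytes")
println("foreach: ", @allocated(square_foreach!(b)), " bytes")
```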
I try to use the most specific tool I can for a job. for supports early exit via break; map/foreach don’t, so I prefer foreach (or something more specific like reduce) when I don’t need early exit.
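A small sketch of what I mean by picking the most specific tool. The function names are hypothetical:

```julia
# Early exit needs `for` (or a function that short-circuits internally).
function first_negative_for(xs)
    for (i, x) in enumerate(xs)
        if x < 0
            return i        # leaves the loop as soon as a match is found
        end
    end
    return nothing
end

# A more specific tool expresses the same intent without a hand-written loop.
first_negative(xs) = findfirst(<(0), xs)

# Without early exit, foreach says "visit everything, only for the side effect".
log_all(xs, sink) = foreach(x -> push!(sink, x), xs)
```

For a Vector both search functions return the same index, or nothing when no element is negative.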
Another example, besides heterogeneous tuples, where foreach and friends (map, foldl etc.) can be much more efficient is collections with complex iteration state.
I am writing code where I have a kind of collection whose iteration must unfold into several nested loops. So, implementing iterate means I have to carry the state of each nested loop, and when I slap a filter on top, the compiler seems to give up and allocate dynamic memory.
foreach and friends on such types can be implemented in transducer style. The difference is explained in the Transducers.jl docs, in the section "Comparison to iterators".
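To illustrate the idea without pulling in Transducers.jl, here is a minimal sketch using a hypothetical collection UpperPairs (all pairs i < j up to n). It is not the Transducers.jl API, just a plain foreach method for a type I own. The iterate version has to pack and unpack the state of both loops on every step, while the foreach version lets the collection run ordinary nested loops and call f from inside.

```julia
# Hypothetical collection: all pairs (i, j) with 1 <= i < j <= n.
struct UpperPairs
    n::Int
end

# External iteration: the state of both nested loops is threaded through `iterate`.
function Base.iterate(p::UpperPairs, state = (1, 2))
    i, j = state
    while i < p.n
        if j <= p.n
            return ((i, j), (i, j + 1))
        end
        i += 1          # inner loop exhausted, advance the outer loop
        j = i + 1
    end
    return nothing
end
Base.IteratorSize(::Type{UpperPairs}) = Base.SizeUnknown()
Base.eltype(::Type{UpperPairs}) = Tuple{Int,Int}

# Internal iteration: the collection drives plain nested loops and calls `f`.
function Base.foreach(f, p::UpperPairs)
    for i in 1:p.n-1, j in i+1:p.n
        f((i, j))
    end
    return nothing
end
```

Both paths visit the same pairs in the same order. Only the foreach method avoids carrying explicit state between steps.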
Another (potential) advantage is that foreach does not specify the order in which the collection is traversed, so in certain cases it may be implemented to take advantage of various compiler optimizations and/or multi-threading.
Yet, I’d prefer to have map and foreach with unspecified-order semantics proposed by the OP of the latter issue, maybe as dedicated implementations per @Tamas_Papp 's suggestion in the same issue, just to have a common name for developers to use when they want to implement such semantics.
Right now, I believe, collect(Iterators.map(f, x)) has a guaranteed execution order with no equivalent for foreach.
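For what it is worth, the order can be observed with a function that records its calls. This is only a sketch: f and calls are hypothetical names, and it shows what currently happens for a range, not a documented guarantee.

```julia
calls = Int[]
f(x) = (push!(calls, x); x^2)     # hypothetical function that records each call

ys = collect(Iterators.map(f, 1:4))
println(calls)                    # the order in which f was invoked
```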
Actually, the ordering of the output is also debatable. Right now I’m working on a data structure for parallel computations, and it has two notions of order: insertion order and iteration order (the idea is that it can be treated as an ordinary iterable as a fallback). For practical purposes, insertion order is the “natural” order for mapping output, but if it is used in a multiple-input map with another argument that is not indexed, there’s no option other than iteration order, which messes everything up.