What is the advantage of "foreach(v) do.." over "for i in v..."?

This

julia> let t = 0
           for x in v
               t += x
           end
           t
       end

vs

julia> let t = 0
           foreach(v) do x
               t += x
           end
           t
       end

It appears to me that for can do the same, is no more verbose, and has simpler semantics. What’s the real use of the foreach do.. idiom then?

EDIT: Sorry, I should have mentioned that I had read the docs for for and foreach before posting.
When iterating over several containers:

julia> shorter = 1:2; longer = 'a':'c';

julia> foreach(shorter, longer)  do x,y
           println(x," and ", y)
       end
1 and a
2 and b

julia> for (x,y) in zip(shorter,longer)
           println(x, " and ", y)
       end
1 and a
2 and b

julia> for i in 1:length(shorter)
           println(shorter[i], " and ", longer[i])
       end
1 and a
2 and b

I think your examples show what the difference is. foreach is a version of map that does not return a result (where you are mapping for the side effects). Just type ?foreach in the REPL for an explanation.
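
As a minimal sketch of that difference (the variable names here are only for illustration): map collects the return values, while foreach discards them and returns nothing.

```julia
v = [1, 2, 3]

# map collects the return values of the function into a new array.
squares = map(x -> x^2, v)

# foreach calls the function only for its side effects and returns nothing.
seen = Int[]
result = foreach(x -> push!(seen, x^2), v)
```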

foreach can be more convenient if you already have a function that you want to apply to each element of an iterator. Internally it is simply implemented as a loop. A major disadvantage is that in a scenario like yours, where the closure reassigns the captured variable t, it allocates and is therefore significantly slower than a for loop. You have to jump through some hoops to fix this.

Here is an example showing that the foreach variant is much slower than a for loop unless one wraps the summation variable in a Ref.

function f1(itr)  # for loop
    t = 0
    for x in itr
        t += x
    end
    t
end

function f2(itr)  # foreach
    t = 0
    foreach(itr) do x
        t += x
    end
    t
end

function f3(itr)  # foreach with Ref
    t = Ref(0)
    foreach(itr) do x
        t[] += x
    end
    t[]
end

julia> using Chairmarks

julia> @b 1:1000 f1(_), f2(_), f3(_)
(20.000 ns, 25.360 μs (1459 allocs: 22.797 KiB), 20.000 ns)

One of the key differences I’m aware of is that foreach will unroll the loop when given a Tuple, which can help e.g. with type instabilities due to heterogeneous elements:

using BenchmarkTools

function square_all_for!(v::Tuple)
    for x in v
        x .= x .^ 2
    end
end

function square_all_foreach!(v::Tuple)
    foreach(v) do x
        x .= x .^ 2
    end
end

function mean_tuple()
    return (
        ones(Float16, 2),
        ones(Float32, 2),
        ones(Float64, 2),
        ones(ComplexF16, 2),
        ones(ComplexF32, 2),
        ones(ComplexF64, 2)
    )
end

julia> @btime square_all_for!(_v) setup = (_v = mean_tuple());
  2.690 μs (24 allocations: 1.03 KiB)

julia> @btime square_all_foreach!(_v) setup = (_v = mean_tuple());
  99.101 ns (0 allocations: 0 bytes)

I try to use the most specific tool I can for a job. for supports early exit via break; map and foreach don’t, so I prefer foreach (or something more specific like reduce) when I don’t need early exit.
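
A small sketch of the early-exit difference (the function names are made up for this illustration): break leaves the for loop entirely, whereas return inside a do block only ends the current call of the closure, so it behaves like continue.

```julia
function visited_for(itr)
    visited = Int[]
    for x in itr
        x < 0 && break            # leaves the loop entirely
        push!(visited, x)
    end
    return visited
end

function visited_foreach(itr)
    visited = Int[]
    foreach(itr) do x
        x < 0 && return nothing   # only ends this call of the closure
        push!(visited, x)
    end
    return visited
end

stopped = visited_for([1, 2, -1, 3])
skipped = visited_foreach([1, 2, -1, 3])
```

Here the for version stops at the first negative element, while the foreach version merely skips it and keeps going.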

foreach is a fairly bad fit for reduction operations, at least until the infamous issue "performance of captured variables in closures" (JuliaLang/julia issue #15276 on GitHub) is solved. My favorite application is

x = rand(100)
# Print all of the elements.
foreach(println, x)
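
For the reduction case itself, a sketch of what I would reach for instead of foreach (assuming a simple sum like the one in the original question): a dedicated reduction passes the accumulator as an argument, so no closure has to reassign a captured variable.

```julia
itr = 1:1000

# A plain reduction: nothing is captured or reassigned inside a closure.
total_sum = sum(itr)

# foldl makes the accumulator explicit and threads it through as an argument.
total_foldl = foldl(+, itr; init = 0)
```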

Similar use case. Suppose you have a collection of vectors and you want to sort each one. Then you can do foreach(sort!, vectors).
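
Spelled out with some made-up data:

```julia
vectors = [[3, 1, 2], [9, 7, 8], [5, 4]]

# sort! mutates each inner vector in place; its return values are discarded.
foreach(sort!, vectors)
```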

Thanks all for the answers.

More convenient because:

foreach(f, collection)

looks a bit shorter than:

for i in collection ; f(i) ; end 

?

Yes, that’s what I meant.

AKA “rule of least power”.

Another example, besides heterogeneous tuples, where foreach and friends (map, foldl etc.) can be much more efficient is collections with complex iteration states.

I am writing code where I have a kind of collection whose iteration must unfold into several nested loops. So, implementing iterate means I have to carry the state of each nested loop along, and when I slap a filter on top, the compiler seems to give up and allocate dynamic memory.

foreach and friends on such types can be implemented in transducer style. The difference is explained in the Transducers.jl docs, in the section "Comparison to iterators".
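
A rough sketch of the idea, using a hypothetical Nested type rather than my actual code: instead of encoding the position in the nested loops as an iterate state, the loops are written directly inside a foreach method.

```julia
# Hypothetical container: a vector of vectors that should be traversed as one
# flat sequence.
struct Nested{T}
    chunks::Vector{Vector{T}}
end

# Internal iteration: the nested loops are written out directly, so no
# iteration state has to be packed up and handed back to the caller.
function Base.foreach(f, n::Nested)
    for chunk in n.chunks
        for x in chunk
            f(x)
        end
    end
    return nothing
end

n = Nested([[1, 2], Int[], [3, 4, 5]])
collected = Int[]
foreach(x -> push!(collected, x), n)
```

Note that this type defines no iterate method at all, so a plain for loop over it would not work; only foreach knows how to traverse it.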

Another (potential) advantage is that foreach does not specify the order in which the collection is traversed, so in certain cases it may be implemented to take advantage of various compiler optimizations and/or multi-threading.

I believe that’s still in TBD state. I.e., the documentation leaves it underspecified. Some Github issues:

Potential advantage it is then.

Yet, I’d prefer to have map and foreach with the unspecified-order semantics proposed by the OP of the latter issue, maybe as dedicated implementations per @Tamas_Papp's suggestion in the same issue, just to have a common name for developers to use when they want to implement such semantics.

Right now, I believe, collect(Iterators.map(f, x)) has a guaranteed execution order with no equivalent for foreach.
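
A quick way to observe the order in practice (this only records what happens on a given Julia version, it does not prove a guarantee):

```julia
order = Int[]

# The function records the order in which it is called, then returns x^2.
ys = collect(Iterators.map(x -> (push!(order, x); x^2), 1:4))
```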

Actually, the ordering of the output is also debatable. Right now I’m working on a data structure for parallel computations, and it has two notions of order: insertion order and iteration order (the idea is that it can be treated as an ordinary iterable as a fallback). For practical purposes, insertion order is the “natural” order for the mapped output, but if it is used in a multiple-input map with another argument that is not indexed, there is no option other than iteration order, which messes everything up.