Cartesian For Loop

Here are three (I, II, III) implementations.

import Random.seed! as seed!
import Random.shuffle as shuffle

seed!(1); # [Code I]
for i = shuffle(1:3)
    for j = shuffle(1:4)
        println("($i, $j)")
    end
end

seed!(1); # [Code II]
for i = shuffle(1:3), j = shuffle(1:4)
    println("($i, $j)") # its inner layer sequence varies!
end

seed!(1); # [Code III]
let I = shuffle(1:3), J = shuffle(1:4)
    @show I J;
    for i = I, j = J
        println("($i, $j)") # its inner layer sequence are the same.
    end
end;

In my intuition (and I believe it should be so), Code II should resolves into Code III. (cf. my intuition here).
But the current behavior implies as if Code II instead resolves into Code I, which I think is not understandable.

The style in Code I and Code II are very different—e.g., only one break is entailed to escape the for in Code II, whereas you need two in Code I.

In the first two examples, you call shuffle(1:4) creating a new iterator in each iteration of the i loop. In the last example you call shuffle(1:4) once and reuse same iterator. It’s as if

julia> print(collect(shuffle(1:4)))
[1, 2, 3, 4]
julia> print(collect(shuffle(1:4)))
[3, 1, 2, 4]
julia> print(collect(shuffle(1:4)))
[4, 3, 1, 2]

vs

julia> j=shuffle(1:4)
4-element Vector{Int64}:
 2
 1
 3
 4

julia> print(collect(j))
[2, 1, 3, 4]
julia> print(collect(j))
[2, 1, 3, 4]
julia> print(collect(j))
[2, 1, 3, 4]

collect calls included to clarify that you are iterating the same (or different) iterators, although the iterators are simply vectors here.

2 Likes

Sorry but I don’t think you’ve clarified my confusion. The collect is not relevant here.

julia> shuffle(1:3)
3-element Vector{Int64}:
 3
 1
 2

It’s already a vector.

I don’t think so.

In my way to understand it,

julia> for i = shuffle(1:9)
       println(i)
       end

8
1
3
9
2
6
4
7
5

is equivalent to

let I = shuffle(1:9)
    for i = I
        println(i)
    end
end

isn’t it?

If this logic is true, then it should extends to the Cartesian for loop in my first post. (But the behavior is distinct, I’m wondering this)

It’s the j loop that you are reinitializing for each i

1 Like

It really is, maybe we just need a louder example:

julia> struct PrintRange{R<:AbstractRange} r::R end

julia> PrintRange(r::R) where R<:AbstractRange = (println("hello, new range $(r)!"); PrintRange{R}(r))
PrintRange

julia> Base.iterate(pr::PrintRange, args...) = iterate(pr.r, args...)

julia> for i in PrintRange(1:2), j in PrintRange(3:4)
         println(i, j)
       end
hello, new range 1:2!
hello, new range 3:4!
13
14
hello, new range 3:4!
23
24

This is documented under Control Flow:

Multiple nested for loops can be combined into a single outer loop, forming the >cartesian product of its iterables:

julia> for i = 1:2, j = 3:4
           println((i, j))
       end
(1, 3)
(1, 4)
(2, 3)
(2, 4)

With this syntax, iterables may still refer to outer loop variables; e.g. for i = 1:n, j = 1:i is valid. However a break statement inside such a loop exits the entire nest of loops, not just the inner one.

PrintRange(3:4) executed for each i value (Code I behavior) because it could have depended on it, not together with PrintRange(1:2) in the header prior to all iterations as you expected (Code III behavior). Code I+II and Code III are semantically different, so 1) the results can differ if the inner loop’s call does not only depend on its arguments, as you demonstrated with a PRNG, and 2) if the call is expensive, Code III saves time at no cost. Common subexpression elimination may optimize Code I+II to an equivalent of Code III if the compiler recognizes the inner call is pure, but that’s not as reliable as us writing what we mean, and sometimes we really would like side effects like sleep.

There’s nothing that can be done about the break exiting all the loops of the same block; if it only exited the inner one, then there’s no way to exit any of the other loops. break-ing out of loops separately is already an option with multiple nested blocks. continue does work across the inner loop either way because it skips to the next iteration.

3 Likes

So, Code II does not resolve into Code III, nor does it resolve into Code I.
I got it. Thank you.

I’m not sure what you mean by resolve, but all 3 versions certainly parse differently, and the multiple blocks in Code II is only practically equivalent to the single block in Code I because only the innermost for-block has code for iterations which lacks a break. If you need to write code in between for blocks or need single-level breaks, then Code II is not usable. The merged for-loop syntax of Code II is really only useful for that all-level break when you can’t easily combine all of the possibly interdependent loop iterables into one big one.

Literal meaning: resolve into something := to gradually become or be understood as something.

The info you provided is very useful!
I think the present behavior is good.

That was completely an illusion.
The iterator is updated dynamically in each iteration.
This illustrates:

julia> d = Dict(i => 10i for i = 1:5)
Dict{Int64, Int64} with 5 entries:
  5 => 50
  4 => 40
  2 => 20
  3 => 30
  1 => 10

julia> for (k, v) = d
           println("k = $k, v = $v")
           if k == 4
               pop!(d, 3)
               d[2] *= -1
           end
           println(d)
       end
k = 5, v = 50
Dict(5 => 50, 4 => 40, 2 => 20, 3 => 30, 1 => 10)
k = 4, v = 40
Dict(5 => 50, 4 => 40, 2 => -20, 1 => 10)
k = 2, v = -20
Dict(5 => 50, 4 => 40, 2 => -20, 1 => 10)
k = 1, v = 10
Dict(5 => 50, 4 => 40, 2 => -20, 1 => 10)

julia> d = Dict(i => 10i for i = 1:5)
Dict{Int64, Int64} with 5 entries:
  5 => 50
  4 => 40
  2 => 20
  3 => 30
  1 => 10

julia> let d_const = d, d = copy(d)
           for (k, v) = d_const
               println("k = $k, v = $v")
               if k == 4
                   pop!(d, 3)
                   d[2] *= -1
               end
               println("d = $d")
           end
       end
k = 5, v = 50
d = Dict(5 => 50, 4 => 40, 2 => 20, 3 => 30, 1 => 10)
k = 4, v = 40
d = Dict(5 => 50, 4 => 40, 2 => -20, 1 => 10)
k = 2, v = 20
d = Dict(5 => 50, 4 => 40, 2 => -20, 1 => 10)
k = 3, v = 30
d = Dict(5 => 50, 4 => 40, 2 => -20, 1 => 10)
k = 1, v = 10
d = Dict(5 => 50, 4 => 40, 2 => -20, 1 => 10)

That’s not exactly true. From Interfaces · The Julia Language, on the Base.iterate function:

So in fact, an iterator in julia is not normally updated. What exactly happens with your Dict example, I don’t know. I think updating a Dict while you iterate over it is undefined.

2 Likes

I don’t see the connection between these two statements, but the code blocks you quoted are really equivalent, with you calling shuffle(1:9) only once for both versions.

The iteration state is just the index of the next entry in the underlying Memory to potentially return (if the slot is used). If you modify the Dict (inplace) during the iteration, this will now point to some other content.

julia> d = Dict("a" => 1, "b" => 2)
Dict{String, Int64} with 2 entries:
  "b" => 2
  "a" => 1

julia> d.keys
16-element Memory{String}:
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
 #undef
    "b"
 #undef
 #undef
 #undef
 #undef
    "a"
 #undef
 #undef

julia> iterate(d)  # state is index corresponding to "b", + 1
("b" => 2, 10)

julia> for (x, y) in d
           println("$x => $y")
           if y == 2  # in first iteration
               d["a"] = -1
               # The next valid slot iterate will encounter corresponds to "a".
               # The value here is now -1 instead of 1.
           end
       end
b => 2
a => -1

True, but unexpected things may happen:

julia> d = Dict('a':'j' .=> 1:10)
Dict{Char, Int64} with 10 entries:
  'f' => 6
  'g' => 7
  'a' => 1
  'c' => 3
  'd' => 4
  'e' => 5
  'h' => 8
  'i' => 9
  'j' => 10
  'b' => 2

julia> for (x,y) in d
           println("$x => $y")
           if y == 2
               d['k'] = 11
           end
       end
f => 6
g => 7
a => 1
c => 3
d => 4
e => 5
h => 8
i => 9
j => 10
b => 2
h => 8
j => 10
i => 9
k => 11
a => 1
c => 3
g => 7
b => 2
3 Likes

Indeed. Adding the extra entry here causes the underlying Memory to expand and reorder, so that the state index makes little sense any more. So obviously it’s indeed best to not modify the object you’re currently iterating over! :slight_smile:

For me, for and let share some common features, e.g.

julia> for _ = 1:1
           i = 1
           let i = 2
           end
           println(i)
       end
1

julia> for _ = 1:1
           i = 1
           let
               i = 2
           end
           println(i)
       end
2

julia> for _ = 1:1
           i = 1
           for i = 1:2
           end
           println(i)
       end
1

julia> for _ = 1:1
           i = 1
           for k = 1:2
               i = k
           end
           println(i)
       end
2

julia> for i = 1:5, j = i:5, k = i+j:5 # valid syntax
       end

julia> let i = 1, j = i+2, k = i+j # valid syntax
       end

But I wonder why julia don’t allow users to write the following code

local i = 1, j = i+2, k = i+j # define 3 local names initially

This way of writing it is only valid as part of a let block.
Just change it to

local i = 1
local j = i + 2
local k = i + j

I suspect it’s because it would introduce complications in the parser. It’s possible to declare several local variables without assigning a value to them:

local i, j, k

It’s possible to initialize a single one to a tuple:

local i = 1, 2
# i == (1, 2)

Even things like

j = 3
local i = 1, j == 2
# i == (1, false)

I suppose it could be made to work with multiple assignments in a local declaration, but it’s parsed slightly differently from a let. E.g. let i = 1, 2 does not create an i == (1, 2).

Since the head line of let is essentially a local declaration, why are they differently designed?

let i, j, k has a similar sense, e.g. this example

julia> for _ = 1:1
           a, b, c, x, y, z = 1, 2, 3, 7, 8, 9
           let z, y, x
               a, b, c, x, y, z = -1, -2, -3, -7, -8, -9
           end
           println("$a, $b, $c; $x, $y, $z")
       end
-1, -2, -3; 7, 8, 9

I have no idea. I only need let and local occasionally, and have never given it any thought.

1 Like

In fact you could have said “That’s not true”.
Here is an example:

julia> d = Dict(i => 10i for i = 1:5)
Dict{Int64, Int64} with 5 entries:
  5 => 50
  4 => 40
  2 => 20
  3 => 30
  1 => 10

julia> for (k, v) = collect(d)
           println("$k => $v")
           if k == 4
               empty!(d)
           end
           println(d)
       end
5 => 50
Dict(5 => 50, 4 => 40, 2 => 20, 3 => 30, 1 => 10)
4 => 40
Dict{Int64, Int64}()
2 => 20
Dict{Int64, Int64}()
3 => 30
Dict{Int64, Int64}()
1 => 10
Dict{Int64, Int64}()

We can infer that collect(d) is not re-evaluated.
My finding was not correct—as you pointed out—it’s undefined behavior.

At that time, I was thinking

  • The output sequence is a complete range, therefore I know that for i = shuffle(1:9) was evaluated only once and then settled.. Otherwise, if for i = shuffle(1:9) is dynamically updated in each iteration, then the chances that I can observe a complete range is very small.