`for i = 1:M, j = 1:N` discrepancy (loop vs comprehension/generator)

I noticed the following about the for i = 1:M, j = 1:N syntax. When using this syntax as part of a normal for-loop, j iterates first:

julia> for i = 1:2, j = 1:3
       println((i, j))
       end
(1, 1)
(1, 2)
(1, 3)
(2, 1)
(2, 2)
(2, 3)

But in a comprehension/generator i iterates first:

julia> [println((i, j)) for i = 1:2, j = 1:3]
(1, 1)
(2, 1)
(1, 2)
(2, 2)
(1, 3)
(2, 3)

I get that for i = 1:M, j = 1:N ... end is shorthand for for i = 1:M for j = 1:N ... end end, but it’s a bit confusing to me that the shortened form works differently in a for-loop than in a comprehension, especially taking into account that in a comprehension I have to explicitly use the for i = 1:M for j = 1:N syntax to get j to iterate first. Why is there this difference in behavior?

To summarize, this is the current behavior:

  • for i = 1:M, j = 1:N ... end: j iterates first
  • for i = 1:M for j = 1:N ... end end: j iterates first
  • [... for i = 1:M, j = 1:N]: i iterates first
  • [... for i = 1:M for j = 1:N]: j iterates first

but I would expect:

  • for i = 1:M, j = 1:N ... end: i iterates first
  • everything else as is

Has this issue been discussed before? Does it bother anyone else, or is it just me? I’m curious to see what others think and to learn the rationale for the current behavior or whether the behavior I expect was even considered.
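
For anyone who wants to check the four cases without reading printed output, here is a minimal sketch that records the visit order of each form instead of printing it. The helper name visit_orders is made up for this illustration and is not part of Base.

```julia
# Hypothetical helper (not from Base): records the order in which (i, j)
# pairs are visited by each of the four forms listed above.
function visit_orders(M, N)
    loop_comma = Tuple{Int,Int}[]
    for i = 1:M, j = 1:N
        push!(loop_comma, (i, j))
    end

    loop_nested = Tuple{Int,Int}[]
    for i = 1:M
        for j = 1:N
            push!(loop_nested, (i, j))
        end
    end

    # The comprehension results are discarded; only the side effect matters here.
    comp_comma = Tuple{Int,Int}[]
    [push!(comp_comma, (i, j)) for i = 1:M, j = 1:N]

    comp_nested = Tuple{Int,Int}[]
    [push!(comp_nested, (i, j)) for i = 1:M for j = 1:N]

    return (; loop_comma, loop_nested, comp_comma, comp_nested)
end

orders = visit_orders(2, 3)
orders.loop_comma == orders.loop_nested == orders.comp_nested          # true: j iterates first
orders.comp_comma == [(1, 1), (2, 1), (1, 2), (2, 2), (1, 3), (2, 3)]  # true: i iterates first
```

Only the comma form inside a comprehension differs from the other three.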


There is one more distinction worth making here: The ordinary for loop and the for for comprehension work the same way, and both allow the second (inner) loop to depend on the first (outer). The product comprehension differs not only in iteration order, but also in that it must have a rectangular iteration space:

julia> for i = 1:3, j = i:3  # 2nd is inner loop, may depend on 1st
         println((; i, j))
       end
(i = 1, j = 1)
(i = 1, j = 2)
(i = 1, j = 3)
(i = 2, j = 2)
(i = 2, j = 3)
(i = 3, j = 3)

julia> [println((; i, j)) for i = 1:3 for j = i:3]  # Iterators.flatten, 2nd is inner, may depend on 1st
(i = 1, j = 1)
(i = 1, j = 2)
(i = 1, j = 3)
(i = 2, j = 2)
(i = 2, j = 3)
(i = 3, j = 3)
6-element Vector{Nothing}:  # returns a vector, one triangle of matrix below
 nothing
 nothing
...

julia> [println((; i, j)) for i = 1:3, j = i:3]
ERROR: UndefVarError: i not defined

julia> [println((; i, j)) for i = 1:3, j = 1:3]  # Iterators.product, must be independent
(i = 1, j = 1)
(i = 2, j = 1)
(i = 3, j = 1)
(i = 1, j = 2)
(i = 2, j = 2)
(i = 3, j = 2)
(i = 1, j = 3)
(i = 2, j = 3)
(i = 3, j = 3)
3×3 Matrix{Nothing}:  # returns a matrix
 nothing  nothing  nothing
 nothing  nothing  nothing
 nothing  nothing  nothing

I do think this is confusing; I’m not sure what alternatives were considered. I guess I only write for i = 1:3, j = 1:3 when I don’t care about the iteration order.
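
As a rough sketch of the correspondence noted in the comments above, the two comprehension forms can be compared against explicit Iterators.product and Iterators.flatten calls. The helper inner is a made-up name for this illustration, and the snippet only checks that the results agree; it does not claim to show how the comprehensions are implemented internally.

```julia
# `for i, j` in a comprehension: same elements and shape as Iterators.product.
P = [(i, j) for i = 1:3, j = 1:3]
P == collect(Iterators.product(1:3, 1:3))   # true
size(P)                                     # (3, 3)

# `for i for j` in a comprehension: same elements as flattening one generator per i,
# which is why the inner range is allowed to depend on the outer variable.
inner(i) = ((i, j) for j = i:3)             # hypothetical helper for this sketch
F = [(i, j) for i = 1:3 for j = i:3]
F == collect(Iterators.flatten(inner(i) for i = 1:3))   # true
length(F)                                                # 6
```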


Yes, I had forgotten about this aspect. Though I’m not sure how much I care about this being different; if I use a comprehension with for i = 1:M, j = 1:N I think it’s clear I want a Matrix, whereas in a for-loop I’m not necessarily creating a Matrix, so I think I’m okay with the dependent form (for i = 1:3, j = i:3) working as is in a for-loop. But then it seems awkward/ambiguous to have i iterate first in that case (the output could be something like

(i = 1, j = 1)
(i = 2, j = 2)
(i = 3, j = 3)
(i = 1, j = 2)
(i = 2, j = 3)
(i = 1, j = 3)

which doesn’t seem right at first sight), so maybe I would prefer it to throw an error like in the comprehension case.

Well, if you think about it, the inner for always iterates first…

It is just that when you write fors before the content of the loop, the inner is the second one, and when you write them after, in the comprehension, the inner is the first one.

And whether or not you are in a comprehension, for x, y is always translated to for x, for y.


Hmm… Now that I am on my computer and no longer on my phone, I was able to test some things, and there is still something to discuss:

# The following four functions visit the (o, i) pairs in the same order:
function loop1(inner_range,outer_range,fun)
    for o in outer_range
        for i in inner_range
            fun(o,i)
        end
    end
    return nothing
end
function loop2(inner_range,outer_range,fun)
    for o in outer_range, i in inner_range
        fun(o,i)
    end
    return nothing
end
function comp1(inner_range,outer_range,fun)
    [[fun(o,i) for i in inner_range] for o in outer_range]
    return nothing
end
function comp2(inner_range,outer_range,fun)
    [fun(o,i) for i in inner_range, o in outer_range]
    return nothing
end
# this one behaves differently : 
function comp3(inner_range,outer_range,fun)
    [fun(o,i) for i in inner_range for o in outer_range]
    return nothing
end
display("--loop1---------")
loop1([:i,:n],[:o,:u,:t],(x,y) -> display((x,y)))
display("--loop2---------")
loop2([:i,:n],[:o,:u,:t],(x,y) -> display((x,y)))

display("--comp1---------")
comp1([:i,:n],[:o,:u,:t],(x,y) -> display((x,y)))
display("--comp2---------")
comp2([:i,:n],[:o,:u,:t],(x,y) -> display((x,y)))
display("--comp3---------")
comp3([:i,:n],[:o,:u,:t],(x,y) -> display((x,y)))

outputs :

"--loop1---------"
(:o, :i)
(:o, :n)
(:u, :i)
(:u, :n)
(:t, :i)
(:t, :n)
"--loop2---------"
(:o, :i)
(:o, :n)
(:u, :i)
(:u, :n)
(:t, :i)
(:t, :n)
"--comp1---------"
(:o, :i)
(:o, :n)
(:u, :i)
(:u, :n)
(:t, :i)
(:t, :n)
"--comp2---------"
(:o, :i)
(:o, :n)
(:u, :i)
(:u, :n)
(:t, :i)
(:t, :n)
"--comp3---------"
(:o, :i)
(:u, :i)
(:t, :i)
(:o, :n)
(:u, :n)
(:t, :n)

So the first four functions behave exactly as the logic I stated above says: the inner loop always runs first.

But the last one, the for for comprehension, does not. In fact, there is no equivalent loop:

function loop3(inner_range,outer_range,fun)
    for o in outer_range for i in inner_range
        fun(o,i)
    end
    return nothing
end

will not compile. Therefore, it is the odd one out. The comprehensions that correspond to the loops are the [... for, for] and the [[... for] for] versions; the comprehension [... for for] does not have a loop equivalent.

I see several ways out:

  • We fix the comp3 behavior so that it corresponds to the others.
  • We remove the comp3 possibility completely (my favorite).
  • We add a loop3 possibility that matches the comp3 behavior.
  • We do nothing and warn people in the docs on loop comprehensions (maybe the safest way).

I really prefer we do not remove [... for ... for ...] and end up without an easy way to do Vector comprehensions.


I’m not sure what you mean here. Loops and comprehensions do different things (loops don’t return anything, comprehensions return Arrays), so the equivalence I was talking about was strictly iteration order, in which case [... for for] corresponds to for for ... end end (loop1 in your example, but with o and i swapped).
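
One way to check this, as a sketch: record the visit order of the [... for for] comprehension from comp3 and of a nested loop whose for clauses appear in the same left-to-right order. The function names below are invented for this comparison.

```julia
# Visit order of the comp3-style comprehension (hypothetical helper name).
function comp3_order(inner_range, outer_range)
    seen = Tuple{Symbol,Symbol}[]
    [push!(seen, (o, i)) for i in inner_range for o in outer_range]
    return seen
end

# Nested loop with the `for` clauses in the same textual order as in comp3.
function nested_loop_order(inner_range, outer_range)
    seen = Tuple{Symbol,Symbol}[]
    for i in inner_range
        for o in outer_range
            push!(seen, (o, i))
        end
    end
    return seen
end

comp3_order([:i, :n], [:o, :u, :t]) == nested_loop_order([:i, :n], [:o, :u, :t])  # true
```

So the comp3 order matches a nested loop once the argument names are read in the order the for clauses are written, rather than by the names inner_range and outer_range.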


Oh, this might answer my confusion in my post above. But this brings up another debate: Do we read loops from where the code is outward, or do we read them left to right, top to bottom (like normal text)? I think the latter (reading code like text) makes more sense, in which case the inner loop is always the second to appear.

But then when dealing with comprehensions, because they make Arrays, it makes sense for the first loop to be the inner loop (in the for i, j case) because Arrays are stored column-major and we index arrays with A[i,j]. So even though that breaks my argument in the previous paragraph, it makes logical sense.
(EDIT: Though it doesn’t quite break my argument, because you can argue that there’s only one for, so what does inner vs outer mean in this case?)

And then if for i, j loops through i first in a comprehension, then it makes sense to have the same syntax loop through i first in the loop case. And then it’s easy to explain the (now consistent) difference: for i, j always loops through i first, and for i for j always loops through j first.

Well, I am no longer sure what makes sense and what does not. This is a hard discussion to take sides in.

This does not correspond to the results I have on comp1 and comp2.

This is not consistent with what i have in my results with loop2 and comp2.

Altogether, I am not sure I am getting your point (nor am I getting mine, lol).

I have been burned by this before, so I no longer use comprehensions this way.


Right, I was referring to a hypothetical scenario. (I will say here that I think your comp1 shouldn’t really be part of this discussion because it deals with nested comprehensions. I’m interested in the single comprehension case. Nested comprehensions would just apply the rules for a single comprehension to each comprehension independent of the others.)

That’s my point; I’m saying that your results with loop2 and comp2 don’t make sense (to me). In other words, I’m saying that if I write for i = 1:M, j = 1:N in my code, i should iterate first regardless of where that code appears (i.e., as a standalone for-loop or as a comprehension). Right now, if I write

for i = 1:M, j = 1:N
    println((; i, j))
end

j will iterate first, but if I write

[println((; i, j)) for i = 1:M, j = 1:N]

i will iterate first. I’m claiming this is a discrepancy because in both cases I am writing the same thing (for i = 1:M, j = 1:N). I am therefore suggesting that for both cases above i should iterate first.


This is a lot clearer now, thanks. I am now sure I disagree with your proposal: I like the inner first unconditional principle, but I guess this is just a matter of taste (and my taste is obviously not important).

  • What I pointed out is that comp3 is an exception to the inner first principle.
  • What you point out is that for i, for j should not follow the inner first principle and should instead always loop through i first, whether it is in a loop or in a list comprehension.

Am I summarizing this right?


To me, everything else should follow the expanded nested form:

for i = 1:3
    for j = 1:3 
        @show (i,j)
    end
end

(i, j) = (1, 1)
(i, j) = (1, 2)
(i, j) = (1, 3)
(i, j) = (2, 1)
(i, j) = (2, 2)
(i, j) = (2, 3)
(i, j) = (3, 1)
(i, j) = (3, 2)
(i, j) = (3, 3)

I always get hit by the following and wonder why the order is reversed here?

[@show (i, j) for i = 1:3, j = i:3];
ERROR: UndefVarError: i not defined
Stacktrace:
 [1] top-level scope
   @ REPL[13]:1

The purpose of the for i, j form is to create a shaped array (2D in this case) and the i, j syntax is consistent with the array indexing syntax: first rows then columns. So the number of elements for j cannot depend on i. This syntax is inherently more restrictive than for ... for ....

The order of iterations in for i, j is consistent with the “right order” of iteration for most arrays: since Julia arrays are column-major by default, it’s generally best to iterate first on the row index (i.e. along a column). In particular, the Array created by array comprehension is indeed column-major, so using this order means Julia can fill the new array in the most efficient way. But as @mcabbott says it’s probably best to use this syntax only when the order doesn’t matter anyway.
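
A small sketch of what this means in practice (the array name A is just for illustration): the element produced at iteration (i, j) lands at A[i, j], and the memory order of the result has i varying fastest, which is the same order in which the comprehension visits the pairs.

```julia
A = [(i, j) for i = 1:2, j = 1:3]

A[2, 3] == (2, 3)   # true: iteration (i, j) fills element A[i, j]

# vec exposes the column-major memory order: i varies fastest.
vec(A) == [(1, 1), (2, 1), (1, 2), (2, 2), (1, 3), (2, 3)]   # true

# Linear indexing walks the same column-major order.
[A[k] for k in eachindex(A)] == vec(A)   # true
```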

For me the questionable choice is to allow for i, j outside of comprehensions/generators. It’s confusing indeed that the order is not the same as in a comprehension, and I wish the syntax was simply not allowed. But once you decide to allow it, it makes sense to use the lexical order of iteration because in this case i and j have no defined association with rows and columns, and the reverse-lexical order would be even more confusing…


I assume you meant for i, j, in which case I think your summary is correct.

But it saves a level of indentation :joy:

I don’t have a problem with allowing the syntax, but I can see why some might prefer not to have it. And I probably wouldn’t protest if it was removed.

I guess when I use for i, j outside of a comprehension I do have an Array that I am looping through, and so in my mind there is still the association with rows and columns. But you’re right, that certainly isn’t required to be the case.

Maybe I should just adopt for (i, j) in Tuple.(CartesianIndices(A)) if I have an Array, and otherwise only use for i, j if I don’t care about the order of iteration.
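
A minimal sketch of that idea, assuming a small 2×3 array A made up for the example: iterating over Tuple.(CartesianIndices(A)) visits the indices in column-major order, so i iterates first, as in the comprehension.

```julia
A = zeros(2, 3)

order = Tuple{Int,Int}[]
for (i, j) in Tuple.(CartesianIndices(A))
    push!(order, (i, j))
end
order == [(1, 1), (2, 1), (1, 2), (2, 2), (1, 3), (2, 3)]   # true: i iterates first

# Variant that skips the intermediate matrix of tuples created by the broadcast;
# Tuple(I) could also be destructured as `i, j = Tuple(I)` inside the loop body.
order2 = Tuple{Int,Int}[]
for I in CartesianIndices(A)
    push!(order2, Tuple(I))
end
order2 == order   # true
```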

I haven’t read everything but the way to remember it is that for ... closer to the body of loop is the inner one right?

for x=, y=
   ...
   # y is inner
end
[... for x= , y=]  # x is inner

Well, yes, that is the right thing to remember. Then, the comprehension [... for i in inner_range for o in outer_range] is the only exception to this general behavior.

huh? in this case I see an abbreviated version of:

for i in
    for o in

so o is the inner loop, and it checks out

Well, run it, or take a look at my experiments with the comp3 function above.

julia> [(i,j) for i in 1:2 for j in 5:6]
4-element Vector{Tuple{Int64, Int64}}:
 (1, 5)
 (1, 6)
 (2, 5)
 (2, 6)

j is the inner as expected since it’s same as:

for i in
    for j in