Do for loops capture by reference or by value

For range for loops in julia, is the element, “elm” in the for loop “for elm in vec ... end” captured by reference or value? In the following test bellow seems in indicate that it is captured by value where I declare an array of two structs, one big and one small, and loop over them. The array with the “big struct” seems to run longer. Is this because in each iteration of the for loop, each struct is being copied to elm? If so, is there any way to have the for loop to capture the element by reference?

julia> struct BigStruct
       a0::Int; a1::Int; a2::Int; a3::Int; a4::Int; a5::Int; a6::Int; a7::Int; a8::Int; a9::Int;
       a10::Int; a11::Int; a12::Int; a13::Int; a14::Int; a15::Int; a16::Int; a17::Int; a18::Int; a19::Int;
       a20::Int; a21::Int; a22::Int; a23::Int; a24::Int; a25::Int; a26::Int; a27::Int; a28::Int; a29::Int;
       a30::Int; a31::Int; a32::Int; a33::Int; a34::Int; a35::Int; a36::Int; a37::Int; a38::Int; a39::Int;
       a40::Int; a41::Int; a42::Int; a43::Int; a44::Int; a45::Int; a46::Int; a47::Int; a48::Int; a49::Int;

       BigStruct(a::Int) = new(a)
       end

julia> struct SmallStruct
       a::Int
       end

julia> Base.:+(a::Int, b::BigStruct) = a + b.a0

julia> Base.:+(a::Int, b::SmallStruct) = a + b.a

julia> bigvec = [BigStruct(i) for i in 1:1000000]

julia> smallvec = [SmallStruct(i) for i in 1:1000000]

julia> function func(vec)
       i = 1
           for elm in vec
              i += elm
           end
           return i
       end

julia> @btime func(bigvec)
  7.361 ms (1 allocation: 16 bytes)
500000500001

julia> @btime func(smallvec)
  342.213 μs (1 allocation: 16 bytes)
500000500001

EDIT:

I compiled equivalent code in c++ explicitly using references and got similar results. It seems this has more to do with accessing non-sequential elements rather than with copying.

This concept does not exist in julia.

It’s a missing optimization at LLVM level.

No.

1 Like

To expand a bit on the answer from the previous poster: The statement for elm in vec will act like a sequence of assignment statements elm=vec[1], then elm=vec[2], and so on. If vec[1] is mutable, then any changes to elm are reflected in vec[1] (and same for 2, 3, etc). If vec[1] is immutable (as in your code), then changes made to elm do not affect vec[1]. This suggests that in the immutable case, Julia sets elm to a copy of vec[1], whereas in the mutable case, Julia sets elm to a pointer to vec[1]. However, there is no requirement that the compiled code actually operates this way; the compiler is allowed to implement this statement with or without copying as it sees fit, as long as the outcome respects the mutable/immutable distinction.

1 Like

If vec[1] is immutable it means you cannot rebind any of the fields in it but mutating a field of it will indeed change vec[1]. The part about whether you can change vec[1] or not has nothing to do with the assignment but of the type itself. The assignment works exactly the same way.

I strongly feel that trying to separate the cases of assignment to a variable on whether the right-hand side is mutable or not causes more confusion than clarity. For example, the statement " This suggests that in the immutable case, Julia sets elm to a copy of vec[1]" doesn’t make sense because why would you copy something immutable? It might be beneficial from the compilers point of view but not when thinking about the semantics of the language.

Looking at assignment as just a no-op that results in the LHS pointing to the RHS is clearer because there is no difference between mutability and immutability and thus doesn’t cause confusion like the first quote in this post.

See issue at https://github.com/JuliaLang/julia/issues/23127 and the doc fixes at https://github.com/JuliaLang/julia/pull/26447.

2 Likes

You are right; I explained this badly. Let me try again.

The statement for elm in vec is like a sequence of assignment statements elm=vec[1], elm=vec[2], etc. The consequence of such a statement is that elm and vec[i] are equal. Using field a0 of elm will yield exactly the same quantity as using field a0 of vec[i] in an expression, and so on for all the fields.

If the type of the object to which elm and vec[i] refer is mutable, then changing a field in one will also change the same field in the other. If the object is immutable (as in your case), then it is not possible to change a field (in either elm or vec[i]).

A subsequent statement of the form elm=<something else> will not affect vec[i] but rebinds elm from scratch regardless of whether the previous binding was to a mutable or immutable object. The same holds for updating assignment statements like elm+=<something else>. (This is different from C++.)

The compiler is allowed to either copy or not copy when it implicitly executes elm=vec[i] regardless of mutability, as long as the outcome respects the mutability/immutability distinction.

Better?

In other words and perhaps shorter, assigning does not implicitly copy (or assignment is a no-op which points the LHS to the RHS).

In other words, and perhaps shorter, assignments are no-ops that point the LHS to the RHS.

elm += <something else> is syntactic sugar for elm = elm + <something else> which is the same as tmp = elm + <something else>; elm = tmp which is a form we already know how to analyze.

It seems your whole point is that assignment doesn’t implicitly copy? The mutability / immutability distinction is not relevant to the semantics of assignment.

My main reason for being pedantic is that the Julia semantics of an assignment statement may be surprising to a new user whose background is either Matlab or C++. (For users with a previous background in Python, I guess all of this seems obvious.)

As an aside, allow me to point out that the explanation “assignments are no-ops that point the LHS to the RHS” does not cover all cases! E.g., if vec is of type Vector{Float64} and j is of type Int, then vec[1]=j must copy and must convert.

So I guess a succinct, correct definition might be:

The assignment statement “A=B”, where A is a name, points A to B in the sense that they are indistinguishable in all subsequent code until A is rebound or goes out of scope. Whether copying occurs is an orthogonal issue and in principle affects only performance, not semantics. The assignment statement A<substructure>=B points A<substructure> to B if A<substructure> is the same type or supertype of the type of B, else it calls the appropriate conversion and copy statement if they are not. Finally, the assignment statement “A(…)=B” defines a new named function, which entails too many rules to list here. Are there other cases?

1 Like

This is not assignment, this is setindex!(vec, j, 1) and is a function that can do whatever it wants.

Thank you kristoffer and Stephen for your detailed answers! I’m coming from a C++ background, so an assignment statement like A=B to me implies a copy. But from this conversion, it’s now my understanding that in Julia A=B simply “binds” A to B. Whether this implies a copy or capture by reference is dealer’s compiler’s choice. Thus, how elm captures each element in vec (for elm in vec ... end) is not guaranteed syntactically but rather is determined by the compiler at compile time (and on a side note, this also explains what I considered “weird” behavior in Python).

I noticed that in the Julia manual under “noteworthy differences from C/C++” that this is touched on in fifth bullet point: “Julia values are not copied when assigned or passed to a function. If a function …” But, the conceptual differences between assignment operations between C/C++ and Julia seem to be not really expounded upon. Perhaps a bullet point detailing these differences might be beneficial?

In julia, compare to

julia> big2 = rand(Int, 50*length(bigvec));
julia> bigv=view(big2, 1:50:length(big2));
julia> @btime sum($bigv)
  8.167 ms (0 allocations: 0 bytes)
julia> @btime func($bigvec);
  8.716 ms (0 allocations: 0 bytes)
julia> @btime func($smallvec);
  513.106 μs (0 allocations: 0 bytes)

The bigstruct is slower by a factor of 17 on my machine. We amplify reads by a factor of 8, because we only use 8 bytes / cacheline. I’d guess some fancy vector load instructions are responsible for making the good version so much faster.


In general, I find julia easier to understand if I just look at how I’d compile the code. Assembly is simple, C++ is very very complicated. If you come from C++, then you probably can translate from C++ into terribly unoptimized assembly in your head and compare to the following?

What happens on the assignment elm=v[i], with v::Vector{T} depends on the type T. Suppose, for simplicity, that we are typestable and T is concrete (not a union or abstract type). There are three cases:

(1) T is a bitstype.
(2) T is immutable but not isbits
(3) T is mutable

In case (2) and (3), the vector v actually stores pointers to heap-allocated T instances. The assignment loads the pointer, into a pointer-sized stack-slot called elm.

In case (1), the vector v actually stores the structs. The assignments performs a memcopy from the memory region in v into a sizeof(T)-sized stack-slot called elm.

If you are lucky, then llvm will figure out that you only use a single value of this stack slot elm, plan only for a smaller stack-slot and copy only the values it needs. If you are very lucky, then it won’t need a stack-slot at all, but instead use a register or even a vector-load. I am not so sure what affects these optimizations (e.g. stack unwinding / exception handling) and I have seen julia store stuff into the stack that imho should not go there. So this is sometimes unreliable.

1 Like