Unexpected change of field value in struct

I was not expecting the change in the field values in the following example. Is this a recent change?

julia> x = [10., 10.]
2-element Vector{Float64}:
 10.0
 10.0

julia> @kwdef struct Test
        a =  x
        b =  x
       end
Test

julia> T = Test()
Test([10.0, 10.0], [10.0, 10.0])

julia> a = T.a
2-element Vector{Float64}:
 10.0
 10.0

julia> a[1] = 1.
1.0

julia> Test()
Test([1.0, 10.0], [1.0, 10.0])

I also find that using copy(x) has unexpected behaviour:

julia> x = [10., 10.]
2-element Vector{Float64}:
 10.0
 10.0

julia> @kwdef struct Test
        a =  copy(x)
        b =  x
       end
Test

julia> T = Test()
Test([10.0, 10.0], [10.0, 10.0])

julia> a = T.a
2-element Vector{Float64}:
 10.0
 10.0

julia> a[1] = 1.
1.0

julia> Test()
Test([10.0, 10.0], [10.0, 10.0])

julia> b = T.b
2-element Vector{Float64}:
 10.0
 10.0

julia> b[1] = 1.
1.0

julia> Test()
Test([1.0, 10.0], [1.0, 10.0])

julia> x = [10., 10.]
2-element Vector{Float64}:
 10.0
 10.0

julia> Test()
Test([10.0, 10.0], [10.0, 10.0])

It’s always been like that. Writing

@kwdef struct Test
    a = x
    b = x
end

means that every Test instance initializes both fields with the value x.

Since x is a reference to an array, .a and .b are also references to the same array, so they’re not just equal, but identical:

julia> t1 = Test()
Test([10.0, 10.0], [10.0, 10.0])

julia> x === t1.a === t1.b
true

Hopefully that also explains the behaviour with copy.

Looking at the expansion of @kwdef can also provide some insight:

julia> @macroexpand @kwdef struct Test
           a = copy(x)
           b = x
       end
quote
    #= util.jl:609 =#
    begin
        $(Expr(:meta, :doc))
        struct Test
            #= REPL[25]:2 =#
            a
            #= REPL[25]:3 =#
            b
        end
    end
    #= util.jl:610 =#
    function Test(; a = copy(x), b = x)
        #= REPL[25]:1 =#
        Test(a, b)
    end
end

The key is that copy(x) and x become the default values of the keyword arguments of a method, so copy gets called every time the method is invoked.
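
As a rough sketch of the same mechanism without the struct: make below is a hypothetical function whose keyword defaults mirror the constructor in the expansion. The defaults are evaluated at call time, which is also why rebinding x to a new array changes what later calls return.

```julia
x = [10.0, 10.0]

# Hypothetical function mirroring the keyword constructor in the expansion.
make(; a = copy(x), b = x) = (a, b)

a1, b1 = make()
a2, b2 = make()

a1 === a2    # false: copy(x) ran once per call, giving two distinct arrays
b1 === b2    # true: both are the array currently bound to x
b1 === x     # true

x = [1.0, 2.0]    # rebind the global x to a new array
a3, b3 = make()
b3 === x     # true: the default looks up x again at call time
b3 === b1    # false: b1 still refers to the old array
```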

I’m not sure I understand why this is expected. The behaviour is different if x is a scalar:

julia> y = 1.0
1.0

julia> @kwdef struct TestScalar
        a = y
        b = y
       end
TestScalar

julia> T = TestScalar()
TestScalar(1.0, 1.0)

julia> a = T.a
1.0

julia> a = 2.
2.0

julia> TestScalar()
TestScalar(1.0, 1.0)

Moreover, according to its documentation, copy(x) does not return a reference to x:

copy(x)

  Create a shallow copy of x: the outer structure is copied, but not all internal values. For example, copying an array produces a new array with identically-same elements as the original.

An example:

julia> x = [10., 10.]
2-element Vector{Float64}:
 10.0
 10.0

julia> y = copy(x)
2-element Vector{Float64}:
 10.0
 10.0

julia> z = x
2-element Vector{Float64}:
 10.0
 10.0

julia> x[1] = 2.
2.0

julia> x
2-element Vector{Float64}:
  2.0
 10.0

julia> y
2-element Vector{Float64}:
 10.0
 10.0

julia> z
2-element Vector{Float64}:
  2.0
 10.0

The change in z is expected.

Edit: I am not changing the original Array in the first example, only the value of a new variable that is created from a struct field. This struct is supposed to be immutable.

I see that there might be some complications when the default field values are Arrays that are defined elsewhere, because this is a macro, but I still consider it unexpected behaviour.

  1. It seems to happen only with Arrays
  2. The struct is supposed to be immutable
  3. Copying has inconsistent behaviour. Field a doesn’t change when an element of variable a changes, but it does change when an element of variable b changes, even though b was taken from field b and not from field a.

It seems to happen only with Arrays

Sort of. This is explained in the documentation (though it should be much more prominent):

In Julia, all arguments to functions are passed by sharing (i.e. by pointers). Some technical computing languages pass arrays by value, and while this prevents accidental modification by callees of a value in the caller, it makes avoiding unwanted copying of arrays difficult. By convention, a function name ending with a ! indicates that it will mutate or destroy the value of one or more of its arguments (compare, for example, sort and sort!). Callees must make explicit copies to ensure that they don’t modify inputs that they don’t intend to change. Many non-mutating functions are implemented by calling a function of the same name with an added ! at the end on an explicit copy of the input, and returning that copy.
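
A small sketch of what that paragraph means in practice. The functions rebind and zero_first! are made up for illustration; sort and sort! are the ones mentioned in the quote.

```julia
# Assigning to the local name only rebinds it; the caller's array is untouched.
function rebind(v)
    v = [0.0, 0.0]
    return v
end

# Writing into the array mutates the object the caller also refers to.
function zero_first!(v)
    v[1] = 0.0
    return v
end

x = [3.0, 1.0, 2.0]

rebind(x)                  # returns a new array
unchanged = (x == [3.0, 1.0, 2.0])   # true: x was not modified

sorted = sort(x)           # non-mutating: returns a new, sorted array
sort!(x)                   # mutating: sorts x in place
zero_first!(x)             # mutating: x[1] is now 0.0
```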

This is because the scalar is an immutable value. When the field is bound to an array, what is immutable is the reference to the array: you cannot rebind the field of the struct to another array, but you can change the elements of the array. When the field is bound to a scalar, you cannot change the value, period, because the scalar is not a reference to a mutable object; it is just the value, and the definition of your struct makes that field immutable.
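
A minimal sketch of that distinction, using a hypothetical struct Holder with one array field and one scalar field. ismutable reports whether a value can be modified in place.

```julia
struct Holder          # hypothetical immutable struct
    v::Vector{Float64}
    s::Float64
end

h = Holder([10.0, 10.0], 1.0)

ismutable(h)      # false: the fields of h cannot be rebound
ismutable(h.v)    # true: the array that h.v refers to can still be modified

h.v[1] = 1.0      # allowed: this mutates the array, not the struct

# Rebinding a field of an immutable struct throws an error.
rebind_failed = try
    h.v = [0.0, 0.0]
    false
catch
    true
end
```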

It happens with any mutable object:

julia> mutable struct A
           x
       end

julia> a = A(1.0)
A(1.0)

julia> @kwdef struct B
           x = a
           y = a
       end
B

julia> b = B()
B(A(1.0), A(1.0))

julia> a.x = 2.0
2.0

julia> b
B(A(2.0), A(2.0))

The main issue is not that I modify the initial x, which modifies the field values (although I consider this unexpected as well, since the struct is immutable).

I create a variable from the field value of a, modify one element, and both a and b change. Even if fields a and b are references to Array x, each variable is a reference to only one of the two fields. I didn’t expect the other field to change.

Thank you, that clarifies the change in the fields when the initial Array changes. I just need to take that into account.

It’s the very first subsection of the “Functions” chapter.

Basically, this thread seems to be an instance of the common assignment vs. mutation confusion, since argument passing is effectively just assignment.

To sum up (correct me if I’m wrong):

  1. Changing the initial Array changes the field values because of the assignment
  2. Creating a variable introduces another assignment, and a = T.a = x, which also changes T.b since T.b = x
  3. copy(x) doesn’t create an assignment, so changing a doesn’t change T.b since a = T.a and T.b = x. However, changing b changes both T.a and T.b since b = T.b = x and every time I create a new instance of Test, T.a copies the new value of x.

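By itself, a = T.a = x doesn’t “change” T.b, even though T.b = x; it just makes a and T.a “point” to the same array x as T.b. Only a later mutation such as a[1] = 1. modifies that one shared array, and the change is then visible through every name that points to it.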
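
To make that concrete, here is a small sketch that checks identity with === at each step. Demo is a hypothetical stand-in for the Test struct above, and Base.@kwdef is spelled out in case @kwdef is not exported in your Julia version.

```julia
x = [10.0, 10.0]

Base.@kwdef struct Demo    # hypothetical stand-in for Test
    a = x
    b = x
end

T = Demo()

a = T.a            # assignment only: nothing is copied or modified here
a === T.b          # true: a, T.a, T.b and x are all names for one array

a[1] = 1.0         # mutation of that single array

T.b[1] == 1.0      # true: T.b is the same array that was just modified
x[1] == 1.0        # true: so is x
Demo().a === x     # true: a new instance picks up the same array again
```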