Unexpected behaviour when using `repeat`

Hello everyone,

I’m running into some difficulties when using the repeat function to repeat arrays. My goal is to create an array of arrays, apply a function to each one, and get each array back with its new value appended. In my real code I’m trying to parallelize the operations, but here I’m just showing a minimal working example of the problem.

julia> x_list = repeat([[0.25]], 3)
3-element Vector{Vector{Float64}}:
 [0.25]
 [0.25]
 [0.25]

julia> r_list = [2.9, 3.0, 3.1]
3-element Vector{Float64}:
 2.9
 3.0
 3.1

julia> for (x, r) in zip(x_list, r_list)
    push!(x, x[1]+r)
end

Ideally my output would be

julia> x_list
3-element Vector{Vector{Float64}}:
 [0.25, 3.15]
 [0.25, 3.25]
 [0.25, 3.35]

But that’s not the case; instead I get

julia> x_list
3-element Vector{Vector{Float64}}:
 [0.25, 3.15, 3.25, 3.35]
 [0.25, 3.15, 3.25, 3.35]
 [0.25, 3.15, 3.25, 3.35]

When I test x_list[1] === x_list[2] and x_list[2] === x_list[3], I get true in both cases. So, in my limited understanding, when `repeat` creates the array it sets every item to point to the same array in memory, so the items are indistinguishable and my code doesn’t work as expected.

My question is: is this expected behaviour? I’ve tested repeat with integers and this doesn’t happen. Also, if it is expected, how can I solve my problem? In my specific implementation I still create an array with the initial condition, but the number of times it is repeated is determined at runtime.

Kinda, yeah. repeat makes a shallow copy, so the inner arrays are shared rather than duplicated:

julia> a=[[3]];

julia> b=copy(a);

julia> push!(b[1], 4);

julia> a
1-element Vector{Vector{Int64}}:
 [3, 4]

you can do something like this instead:

julia> as = [[0.25] for _ = 1:3];

julia> push!(as[1], 2);

julia> as
3-element Vector{Vector{Float64}}:
 [0.25, 2.0]
 [0.25]
 [0.25]
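
For the case in the question, where the number of copies is only known at runtime, the same comprehension idea can be combined with copy. This is only a minimal sketch: the names x0 and n are placeholders that do not appear in the original code, and it assumes the initial condition is a flat Vector{Float64}, for which a shallow copy is enough.

```julia
x0 = [0.25]                  # initial condition (placeholder name)
n = 3                        # number of copies, assumed known only at runtime
r_list = [2.9, 3.0, 3.1]

# copy(x0) is evaluated once per iteration, so every slot gets its own vector
x_list = [copy(x0) for _ in 1:n]

for (x, r) in zip(x_list, r_list)
    push!(x, x[1] + r)       # mutates only the vector stored in this slot
end

# x_list is now [[0.25, 3.15], [0.25, 3.25], [0.25, 3.35]]
# x0 is still [0.25]
```

If the initial condition itself held mutable objects (say, a vector of vectors), deepcopy would probably be the safer choice than copy.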

When you create x_list = repeat([[0.25]], 3), all three elements refer to the same inner array in memory:

julia> x_list = repeat([[0.25]], 3)
3-element Vector{Vector{Float64}}:
 [0.25]
 [0.25]
 [0.25]

julia> x_list[1][1] = -1
-1

julia> x_list
3-element Vector{Vector{Float64}}:
 [-1.0]
 [-1.0]
 [-1.0]

This is a common gotcha; most commonly it crops up when you fill the same array into many locations. But repeating an array of arrays is exactly the same situation. It’s a general thing about the semantics of the language itself.
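
As a rough illustration of why the test with integers in the question showed nothing unusual, the sketch below contrasts replacing the value in a slot with mutating a shared object. The variable names are made up for the example.

```julia
# Integers are immutable: assigning to a slot replaces what that slot holds.
ints = repeat([1], 3)
ints[1] = 10
# ints == [10, 1, 1]

# Vectors are mutable: push! changes the one object all three slots refer to.
vecs = repeat([[1]], 3)
push!(vecs[1], 2)
# vecs == [[1, 2], [1, 2], [1, 2]]

# Assigning a new array to a slot only rebinds that slot, just like the integer case.
vecs[2] = [99]
# vecs == [[1, 2], [99], [1, 2]]
```

So the difference is not really about integers versus arrays in repeat, but about assignment into a slot versus in-place mutation of the object stored there.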

The docs for fill explicitly call this out; you might find the explanation there helpful.

Every location of the returned array is set to (and is thus === to) the value that was passed; this means that if the value is itself modified, all elements of the filled array will reflect that modification because they’re still that very value. This is of no concern with fill(1.0, (5,5)) as the value 1.0 is immutable and cannot itself be modified, but can be unexpected with mutable values like — most commonly — arrays. For example, fill([], 3) places the very same empty array in all three locations of the returned vector:

julia> v = fill([], 3)
3-element Vector{Vector{Any}}:
 []
 []
 []

julia> v[1] === v[2] === v[3]
true

julia> value = v[1]
Any[]

julia> push!(value, 867_5309)
1-element Vector{Any}:
 8675309

julia> v
3-element Vector{Vector{Any}}:
 [8675309]
 [8675309]
 [8675309]

To create an array of many independent inner arrays, use a comprehension instead. This creates a new and distinct array on each iteration of the loop:

julia> v2 = [[] for _ in 1:3]
3-element Vector{Vector{Any}}:
 []
 []
 []

julia> v2[1] === v2[2] === v2[3]
false

julia> push!(v2[1], 8675309)
1-element Vector{Any}:
 8675309

julia> v2
3-element Vector{Vector{Any}}:
 [8675309]
 []
 []

Is it? What’s to stop fill or repeat from calling copy? Or if that’s not enough, a pair of serialize/deserialize?

Yes, it’s true, they could do something different from what they are currently defined to do. And many folks have tried to make them do so.

The key language semantics that I’m thinking of here — that you’ll find everywhere — are:

  • An Array is an object that can be changed
  • The same object can happily have multiple names and can even be placed in multiple locations in another array or datastructure
  • Functions pass the object itself as the argument

Those are the fundamentals that are behind these behaviors. fill(x, ...) and repeat([x], ...) put that one thing that’s named x into multiple slots.
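
A small sketch of those three points in isolation may help. The helper append_zero! is hypothetical and exists only for this example.

```julia
a = [1, 2, 3]
b = a                    # a second name for the same object, not a copy
push!(b, 4)
# a == [1, 2, 3, 4] and a === b

# Hypothetical helper: it receives the object itself, so the caller sees the change.
append_zero!(v) = push!(v, 0)
append_zero!(a)
# a == [1, 2, 3, 4, 0]

slots = [a, a]           # the same object placed in two slots
# slots[1] === slots[2]

c = copy(a)              # an independent object with equal contents
push!(c, 99)
# a is unchanged: [1, 2, 3, 4, 0]
```

Only the explicit copy produces a second object; every other line just gives the original array another name or another place to live.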


I see! It works now, thank you!

That makes sense, thank you!