Changing the values of a subvector returned by a function doesn't change the original vector.

If I modify a subselection of an array, the array itself is modified (I assume the subselection is a reference to the same values). This behaviour is expected.

julia> v = [1 2 3 4]
1×4 Array{Int64,2}:
 1  2  3  4

julia> v[[1 2]] = [5 5]
1×2 Array{Int64,2}:
 5  5

julia> v
1×4 Array{Int64,2}:
 5  5  3  4

If I have a function that takes a vector a as an argument, modifies a subvector of a, and returns a subvector b of a, something unexpected happens:

  • the values of a are changed as expected
  • changing the values of b no longer changes a [not expected].
    Is this intended?
julia> v = [1 2 3 4]
1×4 Array{Int64,2}:
 1  2  3  4

julia> function f(a)
           a[[1 2]] = [6 6]
           a[[1 2]]
       end
f (generic function with 1 method)

julia> vv = f(v)
1×2 Array{Int64,2}:
 6  6

julia> v
1×4 Array{Int64,2}:
 6  6  3  4

julia> vv[1] = 1
1

julia> vv
1×2 Array{Int64,2}:
 1  6

julia> v
1×4 Array{Int64,2}:
 6  6  3  4

[] (getindex) creates a new array. Use SubArrays (view).
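A minimal sketch of that suggestion, assuming a plain Vector and a range index rather than the 1×4 matrix from the question. The name f_view is hypothetical and only used for illustration:

```julia
# Hypothetical variant of f that returns a view instead of a copy.
function f_view(a)
    a[1:2] .= 6            # in-place assignment into a
    return view(a, 1:2)    # SubArray sharing memory with a; same as @view a[1:2]
end

v = [1, 2, 3, 4]           # a true Vector here, not the 1×4 matrix from the post
vv = f_view(v)             # v is now [6, 6, 3, 4]
vv[1] = 1                  # writes through to v
println(v)                 # [1, 6, 3, 4]
```

Because vv is a SubArray whose parent is v, assigning into vv changes v as well.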


The two lines in the body of f, a[[1 2]] = [6 6] and a[[1 2]], look like they do at least partly the same thing, but they don’t. The second line is syntactic sugar for getindex, which creates a copy of the data at those indices.

The first line looks like it calls getindex and then assigns something to the result, but that’s not the case. What you are seeing is sugar for setindex!(a, ...), which mutates a at the given indices.
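To make the distinction concrete, here is a small sketch that spells out both forms as explicit calls, using the same values as the question:

```julia
a = [1 2 3 4]

# `a[[1 2]] = [6 6]` is sugar for setindex!, which mutates a in place:
setindex!(a, [6 6], [1 2])

# `a[[1 2]]` on its own is sugar for getindex, which allocates a new array:
b = getindex(a, [1 2])

b[1] = 1          # only the copy changes
println(a)        # [6 6 3 4]
println(b)        # [1 6]
```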

It would perhaps have been nice if the syntaxes were a bit more distinct, but I have no suggestion for how that could be done.


Thanks! Problem solved.
view was exactly what I needed.
I got confused because I thought the behaviour was similar to NumPy arrays, but that holds only for certain features.


👍 NumPy basic slices are views, whereas Julia’s slices are copies unless you use view or @view.

Shouldn’t slices [] return views like in NumPy to avoid allocations? This seems like a low-hanging optimization that I would expect to be implemented in Julia. Or am I missing something?

Yes. First, it would be a breaking change. Second, whether copying or not improves performance depends entirely on what you do with the slice afterwards. Copying was chosen as the default because it is the safer alternative (it reduces aliasing); if the time spent copying turns out to be significant, you can opt in to aliasing with view.
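A small sketch of the aliasing trade-off, using a hypothetical in-place helper scale_to_unit! that is not from any library. With the default copy the original data is untouched, while the opt-in view lets the mutation reach it:

```julia
# Hypothetical helper that rescales its argument in place.
function scale_to_unit!(x)
    x ./= maximum(x)
    return x
end

data = [2.0, 4.0, 8.0, 16.0]

head_copy = data[1:2]            # copy: independent of data
scale_to_unit!(head_copy)
println(data)                    # [2.0, 4.0, 8.0, 16.0]

head_view = @view data[1:2]      # view: aliases data
scale_to_unit!(head_view)
println(data)                    # [0.5, 1.0, 8.0, 16.0]
```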


Most of the time when I use [], copying is unnecessary. The fact that NumPy chose to return views by default suggests that my experience is shared by many. So I’m inclined to think that an empirical study of existing Julia code would conclude that views are the better choice.

Going from memory here (since I’m bad at searching GitHub):

A few years ago, views were quite expensive, so there was a clear performance benefit to returning copies. As I understood it, the plan was to make slices return views once the performance was sorted out, and most people expected this.

Gradually there was a change in opinion, and once views became fast, doubts crept in about whether switching over was a good idea after all. It would be massively breaking, and the performance benefits were mixed and far from conclusive.

The question of whether to stay with copies or switch to views was discussed very thoroughly, and finally a decision was reached to stick with copies. If you search for keywords like ‘views’, ‘arraymageddon’, ‘arraypocalypse’, and a few others, you can probably find a lot of discussion about this.


This depends mostly on the objects, the implementation of views, and access patterns. As explained in the FAQ, copying data is not always bad.

Just to be clear, this (extensive) discussion has already been had and the decision made. It will not be changed.


Here is an issue discussion you could read about this question: https://github.com/JuliaLang/julia/issues/3701
