What is the quickest way to move a subsection of a matrix?

Raf · November 26, 2018, 2:58am

For example, copy the last two columns onto the first two:

[1 2 3;
 1 2 3;
 1 2 3]

becomes

[2 3 3;
 2 3 3;
 2 3 3]

pablosanjose · November 26, 2018, 7:50am

How about m[:, 1:2] .= view(m, :, 2:3)?

pablosanjose · November 26, 2018, 7:58am

Correction: copyto!(m, CartesianIndices((1:3, 1:2)), m, CartesianIndices((1:3,2:3))) is faster:

julia> @btime copyto!($m, CartesianIndices((1:3, 1:2)), $m, CartesianIndices((1:3,2:3)));
  39.121 ns (1 allocation: 160 bytes)

julia> f(m) = m[:,1:2] .= view(m, :, 2:3); @btime f($m);
  68.940 ns (4 allocations: 320 bytes)

Tamas_Papp · November 26, 2018, 7:59am

It is never clear to me whether the two regions of copyto! or .= can overlap, so I generally avoid this.

pablosanjose · November 26, 2018, 8:02am

Interesting! I believe copyto! takes care of unaliasing source and destination. Can you give an example where overlaps are a problem?

Tamas_Papp · November 26, 2018, 8:22am

I think you misunderstand: since I don’t see this documented, this is something I don’t think I can rely on. Whether this works or not is not relevant, since if this is not part of the interface then it could silently stop working or introduce incorrect results.

bennedich · November 26, 2018, 8:43am

Both the view and copyto! solutions suggested above allocate. Try this to avoid allocations:

copyto!(m, 1, m, 1 + size(m,1), (size(m,2) - 1) * size(m,1))

Or use a for loop.
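For the example in the question, the for loop could look like the sketch below. The name shift_columns_left! and its shift argument k are made up for illustration, and it is restricted to a plain Matrix so that 1-based indexing can be assumed.

```julia
# Hypothetical helper (not part of Base): move every column k places to the
# left, leaving the last k columns unchanged.
function shift_columns_left!(m::Matrix, k::Integer = 1)
    nrows, ncols = size(m)
    0 <= k <= ncols || throw(ArgumentError("k must be between 0 and the number of columns"))
    # A column is read as a source before the pass reaches it as a destination,
    # so walking the columns left to right never reads an overwritten value.
    for j in 1:(ncols - k), i in 1:nrows
        m[i, j] = m[i, j + k]
    end
    return m
end

m = [1 2 3; 1 2 3; 1 2 3]
shift_columns_left!(m)   # m is now [2 3 3; 2 3 3; 2 3 3]
```

The loop only touches elements of m itself, so it should not need any temporary array, though that is worth confirming with @btime on your own Julia version.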

pablosanjose · November 26, 2018, 8:45am

I totally get that, and I fully understand the concern: how can you copy overlapping memory portions without making an extra copy first? I’m actually not sure how copyto! does it. However I’d point out that the documentation of copyto! does not specify that source and destination have to be different, right?

@bennedich: nice one! Pity it only works for full columns, unlike the CartesianIndices approach.

Tamas_Papp · November 26, 2018, 8:49am

I think it would be worthwhile to open an issue about clarifying this. I get the impression that it is supposed to work, e.g.

github.com/JuliaLang/julia

copy! is wrong for overlapping views

opened 03:16PM - 16 Jan 17 UTC

closed 11:33AM - 23 Jun 20 UTC

stevengj

bug

The `copy!` function has special-casing to handle the case where the source and destination arrays are equal, but this machinery does not work for views:

```jl
julia> a = [1:10;]; copy!(a, 2, a, 1, 9)
10-element Array{Int64,1}:
 1
 1
 2
 3
 4
 5
 6
 7
 8
 9

julia> a = [1:10;]; v = view(a, 1:10); copy!(v, 2, v, 1, 9)
10-element SubArray{Int64,1,Array{Int64,1},Tuple{UnitRange{Int64}},true}:
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
```

Also, currently copyto! uses the memmove C function for this, which does handle overlap.
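A small check of that overlap behaviour on a plain Vector, using the same shift as in the issue (with copyto!, the current name of the old copy! method). This only exercises the Array method with linear offsets; it says nothing about views or other array types.

```julia
# Shift elements 1:9 one slot to the right, in place.
a = collect(1:10)
copyto!(a, 2, a, 1, 9)
# a is now [1, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Shift elements 2:10 one slot to the left, in place.
b = collect(1:10)
copyto!(b, 1, b, 2, 9)
# b is now [2, 3, 4, 5, 6, 7, 8, 9, 10, 10]
```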

bennedich · November 26, 2018, 8:57am

Yes, for other ranges, I’d just consider a for loop. Either loop over each individual element, or a loop of copyto!. It’s a bit frustrating how it’s sometimes so hard to avoid allocations for simple operations like this in Julia.
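A sketch of the "loop of copyto!" variant for a block that does not span full columns. The function name move_block! and its argument layout are invented for this example. It assumes a dense Matrix (column-major storage) and the same row range for source and destination, so each column chunk is contiguous.

```julia
# Hypothetical helper: copy m[rows, src_cols] onto m[rows, dest_cols] in place,
# one contiguous column chunk at a time.
function move_block!(m::Matrix, rows::UnitRange{Int},
                     dest_cols::UnitRange{Int}, src_cols::UnitRange{Int})
    nc = length(src_cols)
    length(dest_cols) == nc || throw(ArgumentError("column ranges must have equal length"))
    checkbounds(m, rows, dest_cols)
    checkbounds(m, rows, src_cols)
    nrows = size(m, 1)
    n = length(rows)
    # Pick the column order so that every source column is read before it is
    # overwritten: left to right when moving left, right to left otherwise.
    ks = first(dest_cols) <= first(src_cols) ? (1:1:nc) : (nc:-1:1)
    for k in ks
        d = (dest_cols[k] - 1) * nrows + first(rows)   # linear index of the chunk start
        s = (src_cols[k] - 1) * nrows + first(rows)
        copyto!(m, d, m, s, n)
    end
    return m
end

m = [1 2 3; 4 5 6; 7 8 9]
move_block!(m, 2:3, 1:2, 2:3)   # m is now [1 2 3; 5 6 6; 8 9 9]
```

Because both ranges use the same rows, chunks from different columns never overlap in memory, so only the column order matters here.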

Raf · November 26, 2018, 9:04am

Thanks everyone. I should have specified strictly non-allocating methods! copyto! seems like the way to go; I was imagining something like memmove existed but couldn’t find it.

pablosanjose · November 26, 2018, 9:23am

Is it clear why copyto! with CartesianIndices allocates, unlike a simple for loop? It uses a loop itself:

github.com/JuliaLang/julia/blob/eabc5de03131e780a8be3b58dd576f1007b9ce99/base/multidimensional.jl#L847
      
          function copyto!(dest::AbstractArray{T1,N}, Rdest::CartesianIndices{N},
                            src::AbstractArray{T2,N}, Rsrc::CartesianIndices{N}) where {T1,T2,N}
              isempty(Rdest) && return dest
              if size(Rdest) != size(Rsrc)
                  throw(ArgumentError("source and destination must have same size (got $(size(Rsrc)) and $(size(Rdest)))"))
              end
              checkbounds(dest, first(Rdest))
              checkbounds(dest, last(Rdest))
              checkbounds(src, first(Rsrc))
              checkbounds(src, last(Rsrc))
              src′ = unalias(dest, src)
              ΔI = first(Rdest) - first(Rsrc)
              if @generated
                  quote
                      @nloops $N i (n->Rsrc.indices[n]) begin
                          @inbounds @nref($N,dest,n->i_n+ΔI[n]) = @nref($N,src′,i)
                      end
                  end
              else
                  for I in Rsrc
                      @inbounds dest[I + ΔI] = src′[I]

pablosanjose · November 26, 2018, 10:16am

Hey again,

I’ve checked that the bit that allocates when using copyto! with CartesianIndices is precisely the unalias part, which in this case needs to make a copy of m, I guess. The interesting question is then how can memmove (which gets called directly when using @bennedich’s copyto! solution) avoid any allocation, and whether we could do something similar, and avoid calling Base.unalias. I suspect it’s not completely trivial, as the critical difference is that memmove can exploit the fact that the memory chunk to be moved is contiguous, so it’s easy to avoid a copy, but when using CartesianIndices it’s a little bit more involved, I guess.
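The copy can be observed directly with the snippet below. Note that Base.unalias and Base.mightalias are internal, non-exported functions, so this is only a sketch of their behaviour as I understand it for plain Arrays; it may differ between Julia versions.

```julia
m = [1 2 3; 1 2 3; 1 2 3]

# An array may alias itself, so unalias hands back a fresh copy of the source.
aliases_itself = Base.mightalias(m, m)
src = Base.unalias(m, m)
is_same_object = src === m      # false: src is a new array
has_same_values = src == m      # true: same contents

# A separate array shares no memory with m, so unalias returns it unchanged.
other = copy(m)
untouched = Base.unalias(m, other) === other
```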

tim.holy · November 26, 2018, 11:15am

See also Understanding performance using `@btime` and `@code_warntype`, `@code_llvm`, etc - #2 by tim.holy, which appears to apply here.

pablosanjose · November 26, 2018, 11:17am

You mean this?

julia> @btime copyto!($m, CartesianIndices((1:3, 1:2)), $m, CartesianIndices((1:3,2:3)));
  42.432 ns (1 allocation: 160 bytes)

julia> g(m) = copyto!(m, CartesianIndices((1:3, 1:2)), m, CartesianIndices((1:3, 2:3)))
g (generic function with 1 method)

julia> @btime g($m);
  42.460 ns (1 allocation: 160 bytes)

tim.holy · November 26, 2018, 11:20am

No, I mean the size check inside copyto! itself uses string interpolation. Factor that out into a separate function and it might get faster.
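A minimal sketch of that refactoring, assuming the usual pattern of moving the error construction behind a @noinline function. The names throw_size_mismatch and check_same_size are invented here and are not part of Base.

```julia
# The string interpolation lives in a separate, never-inlined function, so the
# caller's hot path only contains the cheap size comparison.
@noinline function throw_size_mismatch(sdest, ssrc)
    throw(ArgumentError("source and destination must have same size (got $ssrc and $sdest)"))
end

function check_same_size(Rdest::CartesianIndices, Rsrc::CartesianIndices)
    size(Rdest) == size(Rsrc) || throw_size_mismatch(size(Rdest), size(Rsrc))
    return nothing
end

check_same_size(CartesianIndices((1:3, 1:2)), CartesianIndices((1:3, 2:3)))
```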

pablosanjose · November 26, 2018, 11:25am

Ah, right! No, removing the size check doesn’t seem to change the allocations (in v1.1 at least):

julia> function risky_copyto!(dest::AbstractArray{T1,2}, Rdest::CartesianIndices{2},
                         src::AbstractArray{T2,2}, Rsrc::CartesianIndices{2}) where {T1,T2}
           ΔI = first(Rdest) - first(Rsrc)
           src′ = Base.unalias(dest, src)
           for I in Rsrc
               @inbounds dest[I + ΔI] = src′[I]
           end
           dest
       end

julia> g(m) = risky_copyto!(m, CartesianIndices((1:3, 1:2)), m, CartesianIndices((1:3, 2:3)))
g (generic function with 1 method)

julia> @btime g($m);
  37.906 ns (1 allocation: 160 bytes)

tim.holy · November 26, 2018, 11:41am

Right, the allocation came from the unalias, but I was wondering if the string interpolation might nevertheless be measurable.

It does seem that perhaps one could avoid the unalias call by using a branch that checks whether the first element of Rdest is within Rsrc; if so, do the copy in order of Iterators.reverse(Rsrc).
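A rough sketch of that idea for the case where source and destination are the same array. The name shift_region! is hypothetical. One deviation from the suggestion above, which is my own assumption: instead of testing whether first(Rdest) lies inside Rsrc, the direction is chosen from the sign of the last nonzero component of the offset, since a shift with mixed signs (for example up one row and right one column) can overlap without that test firing.

```julia
# Hypothetical helper: copy A[Rsrc] onto A[Rdest] in place without a temporary.
function shift_region!(A::AbstractArray{T,N}, Rdest::CartesianIndices{N},
                       Rsrc::CartesianIndices{N}) where {T,N}
    size(Rdest) == size(Rsrc) || throw(ArgumentError("regions must have the same size"))
    isempty(Rsrc) && return A
    checkbounds(A, first(Rdest)); checkbounds(A, last(Rdest))
    checkbounds(A, first(Rsrc));  checkbounds(A, last(Rsrc))
    ΔI = first(Rdest) - first(Rsrc)
    if moves_forward(ΔI)
        # Destination comes later in iteration order: copy back to front.
        for I in Iterators.reverse(Rsrc)
            A[I + ΔI] = A[I]
        end
    else
        # Destination comes earlier (or is identical): copy front to back.
        for I in Rsrc
            A[I + ΔI] = A[I]
        end
    end
    return A
end

# True if adding ΔI moves an index later in column-major iteration order,
# i.e. the last nonzero component of ΔI is positive.
function moves_forward(ΔI::CartesianIndex)
    for d in reverse(Tuple(ΔI))
        d != 0 && return d > 0
    end
    return false
end

m = [1 2 3; 1 2 3; 1 2 3]
shift_region!(m, CartesianIndices((1:3, 1:2)), CartesianIndices((1:3, 2:3)))
# m is now [2 3 3; 2 3 3; 2 3 3]
```

Whether this compiles to allocation-free code would need checking with @btime; the point is only that no call to Base.unalias is required.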

yang · November 26, 2018, 12:17pm

Maybe you can do it like this:

julia> A=[1 2 3;1 2 3;1 2 3];

julia> B1=circshift(A,(0,2));

julia> B=hcat(B1[:,1:2],B1[:,2])
3×3 Array{Int64,2}:
 2  3  3
 2  3  3
 2  3  3

kristoffer.carlsson · November 26, 2018, 2:01pm

For unalias allocating: https://github.com/JuliaLang/julia/pull/26237.
