What is the quickest way to move a subsection of a matrix?

I think you misunderstand: since I don’t see this documented, this is something I don’t think I can rely on. Whether this works or not is not relevant, since if this is not part of the interface then it could silently stop working or introduce incorrect results.

Both the view and copyto! solutions suggested above allocate. Try this to avoid allocations:

copyto!(m, 1, m, 1 + size(m,1), (size(m,2) - 1) * size(m,1))

Or use a for loop.

1 Like

I totally get that, and I fully understand the concern: how can you copy overlapping memory portions without making an extra copy first? I’m actually not sure how copyto! does it. However I’d point out that the documentation of copyto! does not specify that source and destination have to be different, right?

@bennedich: nice one! Pity it only works for full columns, unlike the CartesianIndices approach.

I think it would be worthwhile to open an issue about clarifying this. I get the impression that it is supposed to work, eg

Also, currently copyto! uses the memmove C function for this, which does handle overlap.

1 Like

Yes, for other ranges, I’d just consider a for loop. Either loop over each individual element, or a loop of copyto!. It’s a bit frustrating how it’s sometimes so hard to avoid allocations for simple operations like this in Julia.

1 Like

Thanks everyone. I should have specified strictly non-allocating methods! Copyto seems like the way to go, I was imagining something like memmove existed but couldn’t find it.

Is it clear why copyto! with CartesianIndices allocates, unlike a simple for? It’s using a loop itself

Hey again,

I’ve checked that the bit that allocates when using copyto! with CartesianIndices is precisely the unalias part, which in this case needs to make a copy of m, I guess. The interesting question is then how can memmove (which gets called directly when using @bennedich’s copyto! solution) avoid any allocation, and whether we could do something similar, and avoid calling Base.unalias. I suspect it’s not completely trivial, as the critical difference is that memmove can exploit the fact that the memory chunk to be moved is contiguous, so it’s easy to avoid a copy, but when using CartesianIndices it’s a little bit more involved, I guess.

See also Understanding performance using `@btime` and `@code_warntype`, `@code_llvm`, etc, which appears to apply here.

You mean this?

julia> @btime copyto!($m, CartesianIndices((1:3, 1:2)), $m, CartesianIndices((1:3,2:3)));
  42.432 ns (1 allocation: 160 bytes)

julia> g(m) = copyto!(m, CartesianIndices((1:3, 1:2)), m, CartesianIndices((1:3, 2:3)))
g (generic function with 1 method)

julia> @btime g($m);
  42.460 ns (1 allocation: 160 bytes)

No, I mean the size check inside copyto! itself uses string interpolation. Factor that out into a separate function and it might get faster.

Ah, right! No, for some reason this doesn’t seem to allocate anything extra (in v1.1 at least)

julia> function risky_copyto!(dest::AbstractArray{T1,2}, Rdest::CartesianIndices{2},
                         src::AbstractArray{T2,2}, Rsrc::CartesianIndices{2}) where {T1,T2}
           ΔI = first(Rdest) - first(Rsrc)
           src′ = Base.unalias(dest, src)
           for I in Rsrc
               @inbounds dest[I + ΔI] = src′[I]

julia> g(m) = risky_copyto!(m, CartesianIndices((1:3, 1:2)), m, CartesianIndices((1:3, 2:3)))
g (generic function with 1 method)

julia> @btime g($m);
  37.906 ns (1 allocation: 160 bytes)

Right, the allocation came from the unalias, but I was wondering if the string interpolation might nevertheless be measurable.

It does seem that perhaps one could avoid the unalias call by using a branch that checks whether the first element of Rdest is within Rsrc; if so, do the copy in order of Iterators.reverse(Rsrc).


you may be can do like this:

julia> A=[1 2 3;1 2 3;1 2 3];

julia> B1=circshift(A,(0,2));

julia> B=hcat(B1[:,1:2],B1[:,2])
3×3 Array{Int64,2}:
 2  3  3
 2  3  3
 2  3  3

For unalias allocating: https://github.com/JuliaLang/julia/pull/26237.

Phew! Heady stuff. So under that PR we would have Base.mightalias(m, m) == false? (I might be missing the point altogether, this is a bit too advanced for me :-D)

My main takeaway from all of this is “just use a for loop” :smile:


That’s very true for every piece of Julia code, but it’s quite frustrating when one is forced to write nested loops to keep performance or use weird @views, copyto!, getindex, etc., instead of a simple m[:,1:2] = m[:,2:3] as in Fortran or MATLAB. Sometimes when I port some Fortran code to Julia with a lot of operations on array slices and vectorized functions, I end up with a more complex Julia code to achieve the same performance. I’m not sure if this is a very tough problem, e.g. because of using a GC or it’s the design of dynamic languages in general, but I’m sure that Julia is improving day after day. Can I expect this to be solved in a near major release, or am I too optimistic?


I’m not saying that people should just use a for loop because it’s faster in this case—there are view-based solutions of utility functions that are just as efficient. In this case I just think a for loop is easier to understand. Otherwise I have to look up the copyto! function to know how it works.

The way I see it Julia allows you to reason about what actually happens when you do m[:,1:2] = m[:,2:3]. It teaches you to think about the possible pitfall of overwriting what you want to copy. Other languages will isolate you from these pitfalls by making a copy always, so that you don’t shoot yourself in the foot. In Julia you actually have different options, which allow you to achieve much higher performance. Actually you can do m[:,1:2] = m[:,2:3] in Julia. It will make a copy, and work like in MATLAB or Mathematica. However Julians will not consider that an optimal choice simply because you have better, non-allocating options, as discussed here, unlike in many other high-level languages. Freedom is never bad!