Avoiding allocations from views: copyto! between arrays of different shape

pablosanjose · October 30, 2018, 6:46pm

Say I have an n x m Matrix, and an n-length Vector that I want to copy to, say, column 2 of my matrix. My first try was to do copyto! using CartesianIndices.

m = rand(3,3)
v = rand(3)
copyto!(m, CartesianIndices((1:3, 2:2)), v, CartesianIndices(1:3))

But that fails. copyto! only works to copy a slice of an AbstractArray{T1,N} to another slice of an AbstractArray{T2,N}, i.e. between tensors with the same number of indices, despite the API very elegantly allows to express the more general case with N1 != N2 as above. What is the proper way to do this (for arbitrary N1, N2) in v1.0? Is there a reason why we would not want a more general copyto!?

pablosanjose · October 30, 2018, 7:03pm

Something like the following (slight modification of the version in Base) seems to work for me. It’s such a minor modification that I suspect somebody had a good reason not to do it…

function copyto!(dest::AbstractArray{T1,N1}, Rdest::CartesianIndices{N1}, src::AbstractArray{T2,N2}, Rsrc::CartesianIndices{N2}) where {T1,T2,N1,N2}
    checkbounds(dest, first(Rdest))
    checkbounds(dest, last(Rdest))
    checkbounds(src, first(Rsrc))
    checkbounds(src, last(Rsrc))
    src′ = unalias(dest, src)
    for (Is, Id) in zip(Rsrc, Rdest)
        @inbounds dest[Id] = src′[Is]
    end
    return dest
end

Sukera · October 30, 2018, 8:34pm

I might not understand your problem correctly, but is there a reason why e.g. X[2,:] .= Y does not work for you?

pablosanjose · October 30, 2018, 10:00pm

No, that works perfectly well, and it’s probably the truly Julian approach. The (small) problem is that, if I understand correctly, that approach allocates a view, and is therefore somewhat slower than a direct copy. Until we have something like #14955 I prefer to avoid views in my particular application. But thanks for pointing that out!

pablosanjose · October 30, 2018, 10:05pm

To illustrate

a = rand(3,3); b = rand(3,3);
f1(a,b) = a[1:2, 2:3] .= b[2:3, 1:2]
f2(a,b) = copyto!(a, CartesianIndices((1:2, 2:3)), b, CartesianIndices((2:3, 1:2)))

on v1.0.1

julia> @btime f1($a,$b);
  61.395 ns (2 allocations: 176 bytes)
julia> @btime f2($a,$b);
  15.850 ns (0 allocations: 0 bytes)

carstenbauer · October 31, 2018, 6:14am

Note, that it only allocates a view of you have a slice on the rhs of the assignment. For copying a full vector (or matrix), as in your original example, it should be efficient.

pablosanjose · October 31, 2018, 6:43am

Mmm, are you sure that’s always true? The following f1 also seems to allocate a view, although indeed the performance is now much closer to f2

a = rand(3,3); b = rand(2,2);
f1(a,b) = a[1:2, 2:3] .= b
f2(a,b) = copyto!(a, CartesianIndices((1:2, 2:3)), b, CartesianIndices((1:2, 1:2)))

julia> @btime f1($a,$b)
  22.696 ns (1 allocation: 64 bytes)
2×2 view(::Array{Float64,2}, 1:2, 2:3) with eltype Float64:
 0.408094   0.413918
 0.0414806  0.610879

julia> @btime f2($a,$b)
  17.146 ns (0 allocations: 0 bytes)
3×3 Array{Float64,2}:
 0.8406     0.408094   0.413918
 0.664731   0.0414806  0.610879
 0.0776809  0.810417   0.810417

carstenbauer · October 31, 2018, 6:48am

I’m sorry, what I said isn’t the full story. it is still creating a view for the lhs. Only no slices at all should be equally fast. (I’m on phone)

carstenbauer · October 31, 2018, 7:40am

Even in the case of fully overwriting a by b the copyto! approach is a bit faster:

julia> f2(a,b) = copyto!(a, CartesianIndices((1:3, 1:3)), b, CartesianIndices((1:3, 1:3)))
f2 (generic function with 1 method)

julia> f1(a,b) = a.=b;

julia> @btime f1($a, $b);
  17.313 ns (0 allocations: 0 bytes)

julia> @btime f2($a, $b);
  20.756 ns (0 allocations: 0 bytes)

I see no reason not to have a more general copyto!.

kristoffer.carlsson · October 31, 2018, 2:43pm

Ref https://github.com/JuliaLang/julia/pull/29868.

kristoffer.carlsson · October 31, 2018, 2:44pm

Your numbers suggest the opposite?

pablosanjose · October 31, 2018, 2:46pm

Wow, wait, what? Did you just elide the view allocation with @inline? Does it also work with a slice on the right side?

kristoffer.carlsson · October 31, 2018, 2:57pm

That PR makes the RHS slice a bit better:

julia> a = rand(3,3); b = rand(3,3);

julia> f1(a,b) = (a[1:2, 2:3] .= b[2:3, 1:2]; nothing)

Before

julia> @btime f1($a,$b);
  79.742 ns (2 allocations: 176 bytes)

After

julia> @btime f1($a,$b);
  64.168 ns (1 allocation: 112 bytes)

~~There is still an allocation happening because equality between AbstractArray and SubArray is not inlined.~~

Actually I am not sure why.

carstenbauer · October 31, 2018, 3:21pm

Apparently I confused myself with the order

pablosanjose · October 31, 2018, 3:28pm

It sure would be great if could work as efficiently as copyto!. Much simpler, and nicer, I would gladly use .= always and stop worrying about hidden allocations.

Topic		Replies	Views
What is the quickest way to move a subsection of a matrix? General Usage matrices	24	2118	November 26, 2018
Memory allocation with `view` and array assignment General Usage array , memory-allocation	12	1366	December 17, 2021
Why making `a[range] = b[range]` seems to make unnecessary allocations? Performance	4	1020	September 30, 2019
Proving that copyto! is allocation-free on a view, using AllocCheck General Usage views , allocations , alloccheck	7	220	October 15, 2024
Preallocating the result of getindex(...) General Usage question	8	1455	July 13, 2017

Avoiding allocations from views: copyto! between arrays of different shape

Related topics