Memory allocation with `view` and array assignment

Hey,

I was wondering about this code:

julia> x = randn((2048, 2048));

julia> z = copy(x);

julia> using BenchmarkTools

julia> @btime $z[2:2000, 2:2000] = view($x, 2:2000, 2:2000);
  5.386 ms (0 allocations: 0 bytes)

julia> @btime $z[2:2000, 2:2000] .= view($x, 2:2000, 2:2000);
  3.195 ms (0 allocations: 0 bytes)

julia> @btime $z[2:2000, 2:2000] = view($z, 2:2000, 2:2000); # why allocations?
  9.731 ms (2 allocations: 30.49 MiB)

julia> @btime $z[2:2000, 2:2000] .= view($z, 2:2000, 2:2000); # why allocations?
  6.859 ms (2 allocations: 30.49 MiB)

I would have expected zero allocations in all of these examples, so why are there allocations when I assign the array to itself? Is it because, without a copy, the following

julia> @btime $z[2:2000, 2:2000] .= view($z, 1:1999, 1:1999);
  6.871 ms (2 allocations: 30.49 MiB)

julia> @btime $z[1:1999, 1:1999] .= view($z, 2:2000, 2:2000);
  6.872 ms (2 allocations: 30.49 MiB)

could fail because of overwriting?

Thanks for the help!

Felix

It's due to the risk of aliasing. See `?Base.unalias` for more details.
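To make that concrete, here is a minimal sketch using the documented but unexported `Base.mightalias` and `Base.unalias` (behavior details may vary across Julia versions):

z = rand(4, 4)
a = view(z, 1:2, :)
b = view(z, 2:3, :)

Base.mightalias(a, b)           # true: both views share z's memory
Base.mightalias(a, rand(2, 4))  # false: unrelated storage

# unalias(dest, src) returns src untouched when aliasing is impossible,
# and an unaliased copy otherwise.
b2 = Base.unalias(a, b)
b2 === b                        # false: a defensive copy was made

Note that `mightalias` is conservative: any two views of the same parent are flagged, even when their index ranges are disjoint.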

This actually looks like a bug. We have the smarts to see that the indices are matched between the LHS and the RHS (and thus that the assignment is memory-safe). I'm not sure why this simple case isn't getting the fast-path, but a more complicated expression avoids the conservative copy:

julia> @btime $z[2:2000,2:2000] .= view($z, 2:2000, 2:2000);
  12.624 ms (2 allocations: 30.49 MiB)

julia> @btime $z[2:2000,2:2000] .= 0 .+ view($z, 2:2000, 2:2000);
  2.654 ms (0 allocations: 0 bytes)

julia> @btime $z[2:2000,2:2000] .= 0 .+ view($z, 1:1999, 1:1999);
  13.290 ms (2 allocations: 30.49 MiB)

If there's a check that notices the source and destination arrays are equal, and so are the indices, couldn't we make a fastest-path (a no-op)?

Although custom array types pose a problem: they may have specialized `getindex` and `setindex!` methods.


If an object is such that `obj[a] = obj[a]` is not conceptually a no-op… I think it's doomed either way.
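To illustrate with a hypothetical type (the name and design are purely for illustration): an array that counts writes makes `a[i] = a[i]` observable, so a blanket no-op fast-path would change behavior.

struct CountingArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
    writes::Base.RefValue{Int}   # incremented on every setindex!
end
CountingArray(data::Array) = CountingArray(data, Ref(0))

Base.size(A::CountingArray) = size(A.data)
Base.IndexStyle(::Type{<:CountingArray}) = IndexLinear()
Base.getindex(A::CountingArray, i::Int) = A.data[i]
function Base.setindex!(A::CountingArray, v, i::Int)
    A.writes[] += 1              # observable side effect
    A.data[i] = v
end

a = CountingArray(rand(3))
a[1] = a[1]                      # value unchanged...
a.writes[]                       # ...but the write was recorded: 1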

I mean, sure, there's no limit to how far down you can go on the optimizations… but every one you make has a cost in the surface area that generic behaviors need to accommodate, and I'd say this one is of fairly low utility. I'm guessing it was an MWE reduced down from something like `x .= f.(x)` (which is similarly missing the aliasing fast-path, perhaps because it has another "fast-path" it's using).
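As a hedged illustration of that `x .= f.(x)` pattern (the function name here is mine): `map!` iterates matched indices directly, so it performs no defensive copy.

h(v) = 2v + 1
x = rand(1024)

x .= h.(x)      # hits the conservative unaliasing copy
map!(h, x, x)   # same result, no defensive copy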

Maybe some hacky use of MappedArrays where the inverse is only an approximation / not exact?
Maybe someone could implement an algorithm that iteratively converges.

(Obviously `log(exp(x))` isn't exact either, but I think people wouldn't mind that turning into a no-op.)

I agree. The more checks like this that exist and can't be skipped, the more likely I am to re-implement things myself when micro-optimizing, to better reflect the problem at hand.

Here the costs would be in (a) the check itself and (b) forwarding the early return up to the function calling the check. So if one already adds an extra check for whether the indices match, then additionally adding a "fastest-path" hopefully shouldn't require much more code (unless forwarding the result is especially costly).

I think that, in general, no one should be writing `a .= a` if it doesn't do anything.
But `a .= b` is common, and someone may have code that uses internal caches and is called both internally by their library and by users. Then, because exact aliasing might be likely in that particular case, it'd be easy enough to optimize manually with `a === b || (a .= b)`.
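As a sketch, that manual fast-path could live in a tiny helper (the name `maybe_copy!` is hypothetical):

# Skip the broadcast entirely when source and destination are the
# exact same object; otherwise fall through to the normal path.
function maybe_copy!(a::AbstractArray, b::AbstractArray)
    a === b || (a .= b)
    return a
end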


Maybe we could perform a runtime memory-dependency test. It's much more general than just handling this trivial case: it can also handle the partially overlapping case and avoid copying. `memmove` does exactly this, though it only deals with one-dimensional arrays…
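A sketch of the idea for the one-dimensional case (my illustration of the `memmove` strategy, not Base's implementation): if the destination range starts at or before the source range, a forward copy never clobbers unread elements; otherwise copy backwards.

function overlap_safe_copy!(dest::Vector, doff::Int,
                            src::Vector, soff::Int, n::Int)
    if dest !== src || doff <= soff
        for k in 0:n-1               # forward: writes trail the reads
            dest[doff + k] = src[soff + k]
        end
    else
        for k in n-1:-1:0            # backward: avoids overwriting the source
            dest[doff + k] = src[soff + k]
        end
    end
    return dest
end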

Now I've encountered a real case where I didn't expect allocations.

Isn't that something we should fix?

julia> function f(x,y)
           x .= y.^2 .+ x.^2
       end
f (generic function with 1 method)

julia> function g(x,y)
           x .= view(y, 1:length(y)).^2 .+ view(x, 1:length(x)).^2
       end
g (generic function with 1 method)

julia> x, y = rand(Int, 2048*2048), rand(Int, 2048*2048);

julia> @time f(x,y);
  0.030055 seconds (15.08 k allocations: 717.336 KiB, 83.78% compilation time)

julia> @time f(x,y);
  0.005222 seconds

julia> @time g(x,y);
  0.076428 seconds (233.42 k allocations: 44.631 MiB, 77.89% compilation time)

julia> @time g(x,y);
  0.026629 seconds (2 allocations: 32.000 MiB)

Performance seems to be significantly different as well:

julia> x, y = rand(Int, 5000*10_000), rand(Int, 5000*10_000);

julia> @btime f($x, $y);
  53.750 ms (0 allocations: 0 bytes)

julia> @btime g($x, $y);
  209.211 ms (2 allocations: 381.47 MiB)

julia> x, y = rand(Int, 10*10), rand(Int, 10*10);

julia> @btime f($x, $y);
  49.186 ns (0 allocations: 0 bytes)

julia> @btime g($x, $y);
  114.352 ns (1 allocation: 896 bytes)

That looks like an anti-aliasing false positive.


Well, `@btime $z[2:2000, 2:2000] .= view($z, 2:2000, 2:2000);` falls back to `copyto!(a::AbstractArray, b::AbstractArray)`, where we always unalias and perform no `a === b` check.

The `copyto!` doesn't fix that, does it?

This example seems unrelated to `copyto!(A::AbstractArray, B::AbstractArray)`, as it falls back to `copyto!(::AbstractArray, ::Broadcasted{Nothing})`. Since we only use `a === b` to avoid the copy during an in-place self-broadcast, you'd better avoid this pattern (`g`) in your code.
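For completeness, a sketch of that advice applied to `g` from above (`g_fixed` is a hypothetical name): the broadcast fast-path recognizes the destination object itself via `a === b`, but not an index-matched view of it, so pass `x` directly where the view would cover the whole array.

function g_fixed(x, y)
    # view(y, ...) does not alias x, so it is harmless; only the
    # self-view of the destination triggered the defensive copy.
    x .= view(y, 1:length(y)).^2 .+ x.^2
end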