# Proposal for a movable SubArray

As is well known, views themselves need memory to hold some bookkeeping information. Since views are often used as temporaries, constructing a fresh view each time leads to unnecessary (de)allocations when the same kind of view is used many times. This penalizes views in scenarios where all the views have the same shape and differ only in their offsets (think of 2-d convolution, and in general any moving-window algorithm).

The situation could be mitigated if we could `set!` the offset of a view (a.k.a. a `SubArray`) in place, preferably with both bounds-checked and unchecked variants:

```julia
move_unsafe!(x::SubArray, diff_in_cartesian_offsets...) = x.offset1 = ???
move!(x::SubArray, diff_in_cartesian_offsets...) = boundscheck && x.offset1 = ???
```
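One way to prototype the proposed semantics in user code today, without touching Base, is a small mutable wrapper. Everything here (the name `MovableView`, the field layout, the zero-based offsets) is purely illustrative, not a real or proposed API:

```julia
# Hypothetical sketch of a movable view as a mutable wrapper.
mutable struct MovableView{T,N,P<:AbstractArray{T,N}} <: AbstractArray{T,N}
    parent::P
    offsets::NTuple{N,Int}   # zero-based offset of the window in each dimension
    sizes::NTuple{N,Int}     # extent of the window in each dimension
end

Base.size(v::MovableView) = v.sizes
Base.getindex(v::MovableView{T,N}, I::Vararg{Int,N}) where {T,N} =
    v.parent[ntuple(d -> v.offsets[d] + I[d], Val(N))...]

# Unchecked move: just shift the stored offsets.
function move_unsafe!(v::MovableView{T,N}, diff::Vararg{Int,N}) where {T,N}
    v.offsets = ntuple(d -> v.offsets[d] + diff[d], Val(N))
    return v
end

# Checked move: verify the shifted window still lies inside the parent.
function move!(v::MovableView{T,N}, diff::Vararg{Int,N}) where {T,N}
    for d in 1:N
        o = v.offsets[d] + diff[d]
        (0 <= o && o + v.sizes[d] <= size(v.parent, d)) ||
            throw(BoundsError(v.parent))
    end
    return move_unsafe!(v, diff...)
end
```

With this sketch, `move!(v, 0, 1)` shifts an existing window one column to the right without allocating a new wrapper, which is the behavior the proposal asks for from `SubArray` itself.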

So to iterate through the columns of an `x::Array{T,2}` I could just write

```julia
v = view(x, :, 1)
for j in 1:size(x, 2)
    # ...use v...
    move!(v, 0, 1)  # shift the view to the next column
end
```
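For contrast, the idiomatic way to write this today reconstructs the view wrapper on every iteration, which is exactly the repeated allocation the proposal wants to avoid:

```julia
x = randn(100, 100)
total = 0.0
for j in 1:size(x, 2)
    v = view(x, :, j)  # a fresh SubArray is constructed on each iteration
    total += sum(v)    # ...use v...
end
```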

with the following benefits:

• `v` only gets allocated once;
• no hand-unrolling of loops, as we currently resort to when the overhead of `x[:,i]` is unacceptable.

AFAIK this cannot be implemented in a non-breaking way right now, since `SubArray`s are immutable.
So do you think a movable `SubArray` is the right way to go?

=====================================================
If you are not convinced of the performance hit, here is an illustrative benchmark.

```julia
function conv2d_view(x::Array{T,2}, k::Int) where T
    ni, nj = size(x) .- k .+ 1
    y = fill(zero(T), ni, nj)
    km1 = k - 1
    for j in 1:nj
        for i in 1:ni
            y[i,j] = sum(view(x, i:i+km1, j:j+km1))
        end
    end
    return y
end

function conv2d_loop(x::Array{T,2}, k::Int) where T
    ni, nj = size(x) .- k .+ 1
    y = fill(zero(T), ni, nj)
    km1 = k - 1
    for j in 1:nj
        for i in 1:ni
            for n in 0:km1
                for m in 0:km1
                    y[i,j] += x[i+m, j+n]
                end
            end
        end
    end
    return y
end

using BenchmarkTools
n = 1000
x = randn(n, n)
k = 3

julia> @benchmark conv2d_view($x, $k)
BenchmarkTools.Trial:
  memory estimate:  68.41 MiB
  allocs estimate:  996179
  --------------
  minimum time:     25.539 ms (6.65% GC)
  median time:      25.768 ms (6.88% GC)
  mean time:        26.733 ms (9.59% GC)
  maximum time:     100.454 ms (61.82% GC)
  --------------
  samples:          187
  evals/sample:     1

julia> @benchmark conv2d_loop($x, $k)
BenchmarkTools.Trial:
  memory estimate:  7.62 MiB
  allocs estimate:  175
  --------------
  minimum time:     16.466 ms (0.00% GC)
  median time:      16.851 ms (0.00% GC)
  mean time:        17.325 ms (2.45% GC)
  maximum time:     64.073 ms (73.48% GC)
  --------------
  samples:          289
  evals/sample:     1

julia> versioninfo()
Julia Version 0.7.0-DEV.4722
Commit 8a5f747 (2018-03-29 19:53 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6800K CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
Environment:
```

Did you try `v0.7` (recent master)? The compiler is becoming eerily clever about memory management for small transient immutables.

FWIW, putting `@inbounds` on the loop function will likely make it perform significantly better.
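Concretely, the suggestion amounts to something like the following variant of `conv2d_loop` from the benchmark above; skipping the bounds checks is safe here only because the loop ranges are constructed to stay inside `x`:

```julia
function conv2d_loop_inbounds(x::Array{T,2}, k::Int) where T
    ni, nj = size(x) .- k .+ 1
    y = fill(zero(T), ni, nj)
    km1 = k - 1
    @inbounds for j in 1:nj, i in 1:ni
        s = zero(T)
        for n in 0:km1, m in 0:km1
            s += x[i+m, j+n]  # bounds checks elided inside @inbounds
        end
        y[i,j] = s
    end
    return y
end
```

Accumulating into the local `s` instead of `y[i,j]` also avoids re-reading the output array in the inner loop.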

I think that (in 0.7) the `view` allocations are only elided if the views are used exclusively inside functions that end up inlined.
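That effect can be observed with a toy example like the one below; the names `inner`/`outer` are just illustrative, and whether the allocation actually disappears depends on the Julia version and on `inner` being inlined into `outer`:

```julia
inner(v) = v[1] + v[end]        # small consumer, a good inlining candidate
outer(x) = inner(view(x, 2:4))  # the view never escapes `outer`

x = randn(10)
outer(x)            # run once to compile
@allocated outer(x) # reports 0 when the SubArray wrapper is elided
```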

Post updated with a simple benchmark on the latest master.

I did run that test. On my machine I got roughly 15 ms (view) vs 8 ms (loop); I just refrained from posting it for simplicity.

Just for what it's worth, the long-term goal with SubArray has been to improve GC support for immutables that contain mutable objects; that should make it possible to pass a view as a stack-allocated value and avoid this issue. That's https://github.com/JuliaLang/julia/pull/18632.

And then there's also a potential for "mutating" immutables: WIP: Make mutating immutables easier by Keno · Pull Request #21912 · JuliaLang/julia · GitHub.

I don't think making SubArray mutable is going to be the direction we want to take; it's not breaking now, but in the future we'll have better support for immutables that could make it even faster, and then going back to immutable would be breaking.


Thank you for the explanation, and the links are immensely enriching!