When should you use views?

I was wondering in which cases you should use views to avoid copying data. The example I had in mind is:

m = maximum(arr[4:end])
m = maximum(@view arr[4:end])

Are those equivalent?

The easiest way to answer your question is by using BenchmarkTools:

julia> using BenchmarkTools

julia> arr = rand(1000);

julia> @benchmark maximum($arr[4:end])
BenchmarkTools.Trial: 
  memory estimate:  7.94 KiB
  allocs estimate:  1
  --------------
  minimum time:     1.261 μs (0.00% GC)
  median time:      1.368 μs (0.00% GC)
  mean time:        2.344 μs (22.04% GC)
  maximum time:     3.117 ms (99.85% GC)
  --------------
  samples:          10000
  evals/sample:     10

julia> @benchmark maximum(@view $arr[4:end])
BenchmarkTools.Trial: 
  memory estimate:  48 bytes
  allocs estimate:  1
  --------------
  minimum time:     950.565 ns (0.00% GC)
  median time:      958.087 ns (0.00% GC)
  mean time:        986.823 ns (0.00% GC)
  maximum time:     2.879 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     23

So yes, @view is needed to avoid copying the data: the first version allocates ~8 KiB, which is about 1000*sizeof(Float64). Moreover, it saves roughly 25-30% of the execution time (minimum 1.261 μs vs. 951 ns; median 1.368 μs vs. 958 ns).
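
For a quick check without BenchmarkTools, Base's @allocated macro reports the bytes allocated by a single call. This is only a minimal sketch: copy_max and view_max are hypothetical helper names introduced here, and the exact byte counts depend on the Julia version.

```julia
# Hypothetical helpers wrapping the two variants from the question.
copy_max(a) = maximum(a[4:end])        # slicing copies 997 elements
view_max(a) = maximum(@view a[4:end])  # the view wraps `a`; no elements are copied

arr = rand(1000)

# Call each once first so that compilation is not counted.
copy_max(arr); view_max(arr)

bytes_copy = @allocated copy_max(arr)
bytes_view = @allocated view_max(arr)

println("copy: ", bytes_copy, " bytes, view: ", bytes_view, " bytes")
```

Both functions return the same value; only the allocation differs.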

Why does a view allocate at all?

A view is an object of type SubArray{Float64,1,Array{Float64,1},Tuple{UnitRange{Int64}},true}. It’s still an object that carries some information (the parent array and the indices), and that information has to be stored somewhere.
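
To see what a view carries, you can inspect it with the accessor functions parent and parentindices. A small sketch, assuming a 10-element vector chosen only for illustration:

```julia
arr = rand(10)
v = @view arr[4:end]

println(typeof(v))          # a SubArray type; exact printing varies by Julia version
println(parent(v) === arr)  # the view refers to the original array, not a copy
println(parentindices(v))   # the tuple of indices stored by the view
```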

I understand that - I should have been more specific with the question: I am curious why it isn’t possible to make it stack allocated.

It would require a compiler change I believe: https://github.com/JuliaLang/julia/issues/14955

Thank you - that’s what I was looking for.

I’ve found the allocating-or-not behavior of views quite puzzling: sometimes a view allocates something, and sometimes it seems not to. It’s not clear to me when each will happen.

It presumably has to do with inlining. There’s more information and explanation here.

Getting back to the original question: yes, using views is functionally equivalent to copying square-bracket indexing, except of course for aliasing behavior: mutating a view changes the original array. And yes, the performance of the two can differ significantly, but it’s not always clear which will win, so benchmarking like tkluck did is the best way to test it.
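
The aliasing difference is easy to see in a small example. This is an illustrative sketch with made-up values:

```julia
arr = collect(1.0:10.0)

c = arr[4:end]         # copy: independent storage
v = @view arr[4:end]   # view: shares storage with arr

# Reading gives the same answer either way.
@assert maximum(c) == maximum(v)

c[1] = -1.0            # changes only the copy
v[2] = 100.0           # writes through to arr[5]

println(arr[4])  # still 4.0
println(arr[5])  # now 100.0
```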

There are a few things you can keep in mind, though, that can help you gain a bit of insight:

  • Indirections are slow. Modern processors can (in some senses) “look ahead” to see what data is coming down the pike and start moving it into the cache before you’ll need it. But they can only do this if the accesses are in a predictable pattern — and the best pattern is a completely sequential one.
  • Copying data is the most sure-fire way of getting your data laid out in a fast sequential chunk of memory. And copying data can be surprisingly fast.
  • Views with “simple” index types are significantly faster than those with arbitrary indices. “Simple” indices include integers, ranges (particularly unit ranges), and colons. In these cases, the memory locations are more predictably computed with arithmetic instead of being arbitrarily looked up.
  • Views with limited combinations of these “simple” indices can be passed directly to BLAS for super-fast linear algebra. You can check whether your view’s layout is BLAS-compatible by checking if it isa StridedArray. If you end up doing any sort of linear algebra and your indexing pattern isn’t strided, then it’ll be totally worth it to make that copy.
  • The more you work with a given view (and the more you repeatedly access into it) the more likely it is that the cost of the indirections will outweigh the upfront cost of copying.
  • If the taking of the view itself is in a for loop and the work you’re doing on it doesn’t inline, then you may want to consider restructuring your loops to either take the views ahead of time, or copy the selected data into a pre-allocated buffer, or re-write the inner work into regular for loops to avoid the need for the view/copy altogether. And hopefully someday we’ll support passing stack-allocated views around.
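
The StridedArray check mentioned in the list above can be tried directly. A minimal sketch; the matrix size and the variable names are arbitrary choices for illustration:

```julia
A = rand(6, 6)

v_col   = @view A[:, 2]          # colon + integer
v_block = @view A[2:4, 1:3]      # unit ranges
v_step  = @view A[1:2:5, :]      # stepped range + colon
v_idx   = @view A[[1, 3, 4], :]  # arbitrary index vector

for (name, v) in [("column", v_col), ("block", v_block),
                  ("stepped", v_step), ("index vector", v_idx)]
    println(rpad(name, 14), v isa StridedArray)
end
```

The first three views use only “simple” indices and are strided; the view built from an index vector is not, so that is the case where making a copy before doing linear algebra is likely to pay off.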
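
The pre-allocated-buffer option from the list above might look roughly like this. It is a sketch under assumptions: colsumsq! is a hypothetical helper, and the sum of squares stands in for whatever work you actually do on the selected data.

```julia
# Hypothetical helper: sum of squares over selected rows of each column,
# reusing one buffer instead of allocating a fresh copy per iteration.
function colsumsq!(out, buf, A, rows)
    for j in axes(A, 2)
        copyto!(buf, view(A, rows, j))  # fill the pre-allocated buffer
        out[j] = sum(abs2, buf)         # work on contiguous memory
    end
    return out
end

A    = rand(100, 8)
rows = [3, 17, 42, 99]                       # arbitrary, non-strided selection
buf  = Vector{Float64}(undef, length(rows))  # allocated once, outside the loop
out  = Vector{Float64}(undef, size(A, 2))

colsumsq!(out, buf, A, rows)
```

Whether this beats working on the view directly depends on how much work is done per buffer fill, so it is worth benchmarking both.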