Performance of views vs. explicit index referencing

lmiq · July 6, 2020, 10:04pm

I spent some time trying to understand the advantages and disadvantages of using views in my code. My main reason for using views was to simplify the code: for example, having a single method to compute the distance between two points in space, instead of several functions depending on the type of data received (a vector and an array, two vectors, two arrays). Views simplify the code quite a lot, because otherwise the indexes of some arrays have to be passed through several functions before they are used.

However, I am noticing that using views leads to considerably slower code than passing the indexes of the elements one wants to consider.

An illustrative example is below, where I compute the sum of the distances between all pairs of points in two arrays of 3D vectors. The version using views is about 10x slower.

I am not sure I have a question, but perhaps someone has something to say about this that might point to a better way to deal with these situations.

# Distance between row i of x and row j of y, using explicit indexes
d(x,y,i,j) = sqrt( (x[i,1]-y[j,1])^2 + (x[i,2]-y[j,2])^2 + (x[i,3]-y[j,3])^2 )
function f1(x,y)
  dsum = 0.
  nx = size(x,1)
  ny = size(y,1)
  for i in 1:nx
    for j in 1:ny
      dsum = dsum + d(x,y,i,j)
    end
  end
  return dsum
end

# Distance between two 3-element collections (vectors, views, ...)
d(x,y) = sqrt( (x[1]-y[1])^2 + (x[2]-y[2])^2 + (x[3]-y[3])^2 )
function f2(x,y)
  dsum = 0.
  nx = size(x,1)
  ny = size(y,1)
  for i in 1:nx
    for j in 1:ny
      dsum = dsum + d(@view(x[i,1:3]),@view(y[j,1:3]))
    end
  end
  return dsum
end

x = rand(1000,3)
y = rand(1000,3)

println(f1(x,y))

println(f2(x,y))

using BenchmarkTools

println(" With indexes: ")
@btime f1($x,$y)

println(" With views: ")
@btime f2($x,$y)

Result:

661731.9520584571
661731.9520584571
 With indexes:
  2.011 ms (0 allocations: 0 bytes)
 With views:
  19.450 ms (2000000 allocations: 122.07 MiB)

mbauman · July 6, 2020, 10:50pm

FWIW, this is greatly improved on Julia 1.5+

 With indexes:
  1.600 ms (0 allocations: 0 bytes)
 With views:
  6.563 ms (0 allocations: 0 bytes)

If you explicitly inline that d function, then you see the same performance:

julia> @inline d(x,y) = sqrt( (x[1]-y[1])^2 + (x[2]-y[2])^2 + (x[3]-y[3])^2 )
d (generic function with 2 methods)

julia> @btime f2($x,$y)
  1.502 ms (0 allocations: 0 bytes)
661146.8809553175

Edit: the core reason for this difference is that d(x, y, i, j) happens to land just under the inlining threshold (and automatically inlines), whereas d(x, y) is just above (thus needing the manual annotation). It’s a pretty interesting case since the two are effectively doing the same thing, but apparently Julia thinks the SubArray indexing is costlier.
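For anyone who wants to try this end to end, here is a minimal self-contained sketch of the inlined version. The names f_index and f_view are my own stand-ins for f1 and f2 above (not from the original post), and the arrays are smaller so it runs quickly. It only checks that both versions give the same sum; it does not measure anything by itself.

```julia
# Index-based distance between row i of x and row j of y.
d(x, y, i, j) = sqrt((x[i,1]-y[j,1])^2 + (x[i,2]-y[j,2])^2 + (x[i,3]-y[j,3])^2)

# Generic distance between two 3-element collections.
# @inline asks the compiler to inline this method at its call sites.
@inline d(x, y) = sqrt((x[1]-y[1])^2 + (x[2]-y[2])^2 + (x[3]-y[3])^2)

# Hypothetical name: same loop as f1, passing indexes down to d.
function f_index(x, y)
    dsum = 0.0
    for i in axes(x, 1), j in axes(y, 1)
        dsum += d(x, y, i, j)
    end
    return dsum
end

# Hypothetical name: same loop as f2, passing views of the rows to d.
function f_view(x, y)
    dsum = 0.0
    for i in axes(x, 1), j in axes(y, 1)
        dsum += d(@view(x[i, 1:3]), @view(y[j, 1:3]))
    end
    return dsum
end

x = rand(100, 3)
y = rand(100, 3)

# Both versions should agree on the result.
@assert f_index(x, y) ≈ f_view(x, y)
```

To compare timings, load BenchmarkTools and run `@btime f_index($x, $y)` and `@btime f_view($x, $y)` as in the first post. The numbers will depend on the Julia version and the machine.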

lmiq · July 7, 2020, 12:07am

Fantastic. Thank you very much. With that, it actually became faster than index passing (still in Julia 1.4). One more thing to learn.
