Optimization of array views


#1

Consider the following code

A = rand(1000,20);
res = zeros(size(A,1));

@time for i in 1:size(A,1)
	p = view(A, i, :)
	res[i] = sum(p)
end
0.004405 seconds (21.94 k allocations: 436.531 KiB)

Note the number of allocations and the memory usage. From what I understand, a view costs some memory to allocate, and one is being allocated in every iteration. But if a view is an immutable struct, why is it not being optimized away / reused across iterations?

PS: I am aware that I could take the sum along a dimension; I am using sum here only as an example.


#2

Did you try putting them into a function?


#3
function h()
	A = rand(1000,20);
	res = zeros(size(A,1));

	for i in 1:size(A,1)
		p = view(A, i, :)
		res[i] = sum(p)
	end
	res
end

@time h();
0.000078 seconds (1.01 k allocations: 211.297 KiB)

@time sum(A,2);
  0.000052 seconds (10 allocations: 8.188 KiB)

The results above are after a warmup run. It's faster, but 1k allocations remain.


#4

One issue is that your function h() includes the time and allocations needed to generate the random array. Rewriting it to accept A as an argument cuts the time in half. Views are still expensive, though: you’re allocating on each iteration of your inner loop for an otherwise cheap summation.

using BenchmarkTools
A = rand(1000, 20)

@btime h1()       # 53.892 μs (1003 allocations: 211.14 KiB)
@btime h2($A)     # 25.247 μs (1001 allocations: 54.81 KiB)
@btime sum($A, 2) # 4.455 μs (6 allocations: 8.03 KiB)
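
The definitions assumed in those benchmarks might look like the following (a sketch, not taken verbatim from the post: `h1` generates its own data, while `h2` takes `A` as an argument so only the loop is measured):

```julia
# h1: allocates its own data, so the timing includes rand()
function h1()
    A = rand(1000, 20)
    res = zeros(size(A, 1))
    for i in 1:size(A, 1)
        res[i] = sum(view(A, i, :))
    end
    res
end

# h2: takes A as an argument, isolating the view-and-sum loop itself
function h2(A)
    res = zeros(size(A, 1))
    for i in 1:size(A, 1)
        res[i] = sum(view(A, i, :))
    end
    res
end
```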

#5

Sorry for the incorrect example shared above. I was actually trying to understand why view objects are allocated on each loop iteration. They are defined as immutable structs with a fixed size, so shouldn’t the compiler be able to reuse the space allocated in the previous iteration, just with new parameters?



using BenchmarkTools
function m()
	A = rand(1000000,20);
	res = zeros(size(A,1));

	for i in 1:size(A,1)
		for j in 1:size(A,2)
			res[i] += A[i,j]
		end
	end
	res
end

@btime m();
78.411 ms (4 allocations: 160.22 MiB)

function h()
	A = rand(1000000,20);
	res = zeros(size(A,1));

	for i in 1:size(A,1)
		p = view(A, i, :)
		res[i] = sum(p)
	end
	res
end

@btime h();
122.569 ms (1000004 allocations: 205.99 MiB)

function p()
	A = rand(1000000,20)
	res = sum(A,2)
end

@btime p();
66.007 ms (9 allocations: 160.22 MiB)

#6

rand(1000000, 20) dominates the runtime of all your functions, so I’d highly recommend passing A as a function argument if you’re trying to isolate the performance of another part of the code.

On allocations, see here. Per Stefan, “Being able to stack-allocate objects that refer to the heap is an important case that we need to address, but doing so is non-trivial and hasn’t been done yet.”

See also #14955.
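
One way to see the quote in action (a sketch; `rowsum` is a hypothetical helper, and the behavior described is for 0.6/0.7-era Julia — newer versions can often elide the wrapper):

```julia
A = rand(1000, 20)

# A view is an immutable SubArray, but it holds a reference to its
# heap-allocated parent array; when the compiler can't prove the view
# doesn't escape (e.g. across a non-inlined call), the wrapper itself
# ends up heap-allocated too.
@noinline rowsum(A, i) = sum(view(A, i, :))

rowsum(A, 1)             # warm up / compile first
@allocated rowsum(A, 1)  # non-zero on 0.6/0.7: the SubArray wrapper is allocated
```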


#7

Try


julia> @inline function isum(x)
           out = zero(eltype(x))
           @simd for xᵢ ∈ x
               out += xᵢ
           end
           out
       end
isum (generic function with 1 method)

julia> using BenchmarkTools

julia> @benchmark h2()
BenchmarkTools.Trial: 
  memory estimate:  164.27 KiB
  allocs estimate:  3
  --------------
  minimum time:     23.534 μs (0.00% GC)
  median time:      26.699 μs (0.00% GC)
  mean time:        37.349 μs (24.13% GC)
  maximum time:     48.556 ms (99.83% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> @benchmark h()
BenchmarkTools.Trial: 
  memory estimate:  211.14 KiB
  allocs estimate:  1003
  --------------
  minimum time:     31.939 μs (0.00% GC)
  median time:      36.909 μs (0.00% GC)
  mean time:        52.521 μs (27.73% GC)
  maximum time:     42.334 ms (99.78% GC)
  --------------
  samples:          10000
  evals/sample:     1

where h2 is simply h, using isum instead of sum.
If the function call gets inlined, Julia can elide the allocations. This was on 0.7.
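
For completeness, the assumed `h2` would be `h` with `isum` substituted for `sum` (a sketch, repeating `isum` so the snippet is self-contained):

```julia
# Inlined summation, as defined above
@inline function isum(x)
    out = zero(eltype(x))
    @simd for xᵢ ∈ x
        out += xᵢ
    end
    out
end

function h2()
    A = rand(1000, 20)
    res = zeros(size(A, 1))
    for i in 1:size(A, 1)
        p = view(A, i, :)
        res[i] = isum(p)  # inlined call, so the view need not be heap-allocated
    end
    res
end
```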

If you can’t easily inline the functions you’re calling, try