Hello,
I tried to use OffsetArrays.jl in my own (finite difference) codes and experienced a sizable slowdown. Here is some code that resembles my use case.
function lap2(x,y)
    n1,n2=size(x)
    for i2=2:n2-1
        for i1=2:n1-1
            @inbounds y[i1,i2]=(x[i1+1,i2]-2x[i1,i2]+x[i1-1,i2])*x[i1,i2]*x[i1+1,i2+1]+
                (x[i1,i2-1]-2x[i1,i2]+x[i1,i2-1])*x[i1-1,i2]*x[i1,i2+1]
        end
    end
end
function lap2o(x,y)
    I1=indices(x,1)[2:end-1]
    I2=indices(x,2)[2:end-1]
    @unsafe begin
        for i2 in I2
            for i1 in I1
                (y[i1,i2]=(x[i1+1,i2]-2x[i1,i2]+x[i1-1,i2])*x[i1,i2]*x[i1+1,i2+1]+
                    (x[i1,i2-1]-2x[i1,i2]+x[i1,i2-1])*x[i1-1,i2]*x[i1,i2+1])
            end
        end
    end
end
using OffsetArrays,BenchmarkTools
x=rand(1001,2001);y=zeros(size(x));
ox=OffsetArray(copy(x),-100:900,-100:1900);oy=OffsetArray(copy(y),-100:900,-100:1900);
#warm up...
julia> @benchmark lap2o(ox,oy)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 4.339 ms (0.00% GC)
median time: 4.363 ms (0.00% GC)
mean time: 4.375 ms (0.00% GC)
maximum time: 6.494 ms (0.00% GC)
--------------
samples: 1140
evals/sample: 1
julia> @benchmark lap2(x,y)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 3.546 ms (0.00% GC)
median time: 3.581 ms (0.00% GC)
mean time: 3.594 ms (0.00% GC)
maximum time: 4.586 ms (0.00% GC)
--------------
samples: 1387
evals/sample: 1
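Before reading too much into these timings, it may be worth checking that both code paths compute the same values. The following is only a minimal sketch, not part of my actual codes: `stencil!`, `interior` and `axs` are hypothetical helper names, the arrays are smaller than in the benchmark, and `axs` exists only to cover the rename of `indices` to `axes` in Julia 0.7.

```julia
using OffsetArrays

# `indices` was renamed to `axes` in Julia 0.7; pick whichever one exists.
const axs = isdefined(Base, :axes) ? Base.axes : Base.indices

# Interior of an index range: drop the first and the last index.
interior(r) = (first(r) + 1):(last(r) - 1)

# Same stencil as lap2/lap2o, written against the array's own index ranges
# so that one method serves both Array and OffsetArray.
function stencil!(y, x)
    I1 = interior(axs(x, 1))
    I2 = interior(axs(x, 2))
    @inbounds for i2 in I2, i1 in I1   # i2 is the outer loop, i1 the inner one
        y[i1, i2] = (x[i1+1, i2] - 2x[i1, i2] + x[i1-1, i2]) * x[i1, i2] * x[i1+1, i2+1] +
                    (x[i1, i2-1] - 2x[i1, i2] + x[i1, i2-1]) * x[i1-1, i2] * x[i1, i2+1]
    end
    return y
end

x = rand(101, 201)
y = zeros(size(x))
ox = OffsetArray(copy(x), -10:90, -10:190)
oy = OffsetArray(zeros(size(x)), -10:90, -10:190)

stencil!(y, x)
stencil!(oy, ox)

# Both runs perform the same arithmetic on the same data, so the underlying
# storage of the offset result should match the plain result.
same = parent(oy) == y
```

If `same` is true, any remaining difference between the two benchmarks comes from the indexing overhead and not from the kernels doing different work.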
The slowdown here (about 20% on the minimum time) is comparable to the one I am seeing in my own codes. If I profile the two calls, I see many calls to unsafe_setindex and unsafe_getindex for the OffsetArrays but not for the normal Arrays.
Is this slowdown expected/normal? Is it system specific, or am I just missing something?
OffsetArrays would clean up my codes and make them more readable if I can solve this slowdown issue.
Cheers!
julia> versioninfo()
Julia Version 0.6.0
Commit 9036443 (2017-06-19 13:05 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
WORD_SIZE: 64
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.9.1 (ORCJIT, sandybridge)