Hello,
while trying to implement a kind of array where each element is computed on demand by a function (the goal is to avoid storing the full array – it’s a generalized version of what is discussed here), I encountered a curious case of mysterious allocation behavior, and need some advice for how to proceed. In short, my module works properly and allocation-free for some functions, but causes allocations with other functions that are only slightly more complex.
The following is my (minimal) implementation for this array type. The constructor ComputedArray
takes a function and a list of coordinate grids, such that if A=ComputedArray(f,xs,ys)
then A[i,j]==f(xs[i],ys[j])
etc for higher dimensions.
module TestComputedArrays
import Base: size, getindex
export ComputedArray
struct ComputedArray{F,C,T,N} <: AbstractArray{T,N}
fn :: F
coords :: C
end
function ComputedArray(fn, coords...)
N = length(coords)
argtypes = map(eltype, coords)
T = Base.return_types(fn, argtypes)[1]
ComputedArray{typeof(fn), typeof(coords), T, N}(fn, coords)
end
size(A::ComputedArray) = map(length, A.coords)
@generated function getindex(A::ComputedArray{F,C,T,N}, I::Vararg{Int,N}) where {F,C,T,N}
ex = :( A.fn() )
for d = 1:N
push!(ex.args, :( A.coords[$d][I[$d]] ))
end
return ex
end
end # module
Here is some example code showing the curious behavior:
include("TestComputedArrays.jl")
using TestComputedArrays
using BenchmarkTools
v(a,b,c,d) = hypot(a-c,b-d)
w(a,b,c,d) = 1/hypot(a-c,b-d)
xs = linspace(0,1,6)
ys = collect(xs)
vy = ComputedArray(v,ys,ys,ys,ys)
wy = ComputedArray(w,ys,ys,ys,ys)
vx = ComputedArray(v,xs,xs,xs,xs)
println("ARRAY ACCESS")
@btime $vy[1,2,3,4]
@btime $wy[1,2,3,4]
@btime $vx[1,2,3,4]
println("COLLECT")
@btime collect($vy)
@btime collect($wy)
@btime collect($vx)
which produces the following output (Julia 0.6.0-rc2):
ARRAY ACCESS
7.957 ns (0 allocations: 0 bytes)
23.622 ns (1 allocation: 16 bytes)
48.681 ns (2 allocations: 224 bytes)
COLLECT
13.458 μs (2 allocations: 10.30 KiB)
56.907 μs (1298 allocations: 30.55 KiB)
82.694 μs (2594 allocations: 293.80 KiB)
Observations:
- [vy] No allocations for the
v
function when the underlyingcoords
areArray
s. - [wy] One allocation for the
w
function, which is simply the reciprocal ofv
.w
itself is not allocating. - [vx] Two allocations if the underlying
coords
arelinspace
s instead ofArray
s, although the common wisdom is thatlinspace
is as performant asArray
. - The COLLECT part shows that the allocations are indeed happening for each array access.
@code_warntype wy[1,2,3,4]
reveals nothing suspicious, likewise forvx
.
The behavior seems to depend on the “complexity” of the underlying function. It also seems to depend on dimensionality, e.g. if I use a function with only two or three arguments (instead of four as in the example), then I find the array access to be non-allocating (but the linspace
s still cause allocations).
My questions:
- What causes these allocations? My guess is that it’s related to inlining. Correct?
- How can I avoid these allocations? I’m okay with forcing
coords
to beArray
s for simplicity, but it needs to be allocation-free even for “complex” functions. Perhaps the@generated function
approach for mygetindex
is the culprit, but AFAICT it produces very straightforward code for each dimensionalityN
.