# Calculating 2-point correlation of a large 3D array: view seems to make it worse?

I am trying to calculate the 2-point correlation of a large 3D BitArray (the data comes from a micro-CT scan of a porous structure, which I have already thresholded, so each voxel is either 1 or 0). At the moment I am using an array with 512 elements in each dimension, and that might get bigger.

My working definition of the 2-point correlation is f(d) = \frac{1}{n}\sum_{i=1}^n x_i x_{i+d} ; so at a particular voxel separation d, I apply an index shift to the array both forward and back, multiply the two slices, and sum over the elements.
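To make the definition concrete, here is a minimal 1D sketch of the same idea (the array and function name here are just for illustration, not part of my actual code):

```julia
# 1D illustration of f(d): average the product of the signal
# with a copy of itself shifted by d.
x = Bool[1, 1, 0, 1, 0, 0, 1, 1]

function two_point(x, d)
    n = length(x) - d                    # number of valid pairs at separation d
    return sum(x[1:end-d] .* x[1+d:end]) / n
end

two_point(x, 1)   # 2 of the 7 neighbouring pairs are both 1, so 2/7
```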

The function I have written is:

using Base.Threads: @spawn

function autocorr_eucl(bindat::BitArray, hmax)
    # First, we need to get all the possible displacements below a certain
    # distance, and arrange them by distance.
    idict = Dict{Float64,Vector{Tuple{Int64,Int64,Int64}}}()
    h_full = ceil(Int, hmax * sqrt(3))
    for hi in 1:h_full
        indcombs = [(j, i, hi - i - j) for i in 0:hi, j in 0:hi if i + j <= hi]
        for ind in indcombs
            dist = round(hypot(ind...), sigdigits=4)
            if dist <= hmax
                if haskey(idict, dist)
                    push!(idict[dist], ind)
                else
                    idict[dist] = [ind]
                end
            end
        end
    end
    xs = sort(collect(keys(idict)))
    nd = length(xs)
    ys = zeros(nd)        # accumulated with +=, so it must start at zero
    ns = similar(xs)
    odims = size(bindat)
    osize = prod(odims)
    ox, oy, oz = odims

    # Now that we have all the possible displacements,
    # we calculate the correlation at each distance.
    # This is relatively easy to parallelize, so I do that.
    @sync for i in 1:nd
        @spawn begin
            inds = idict[xs[i]]
            ninds = length(inds)
            ns[i] = ninds
            totsize = 0.0
            for ind in inds
                nx, ny, nz = ind
                rdims = odims .- ind   # renamed from ndims, which shadows Base.ndims
                contrib = sum(view(bindat, 1:ox-nx, 1:oy-ny, 1:oz-nz) .*
                              view(bindat, 1+nx:ox, 1+ny:oy, 1+nz:oz))
                # contrib = sum(bindat[1:ox-nx, 1:oy-ny, 1:oz-nz] .*
                #     bindat[1+nx:ox, 1+ny:oy, 1+nz:oz])
                nsize = prod(rdims)
                totsize += nsize / osize
                ys[i] += contrib
                # ys[i] += contrib / nsize / ninds
            end
            ys[i] /= (totsize * osize)
        end
    end
    return xs, ys
end


(I already added some thread parallelism here to try to speed things up, which gave me about a 3x speedup with 4 threads. I’m impressed it was so easy!)

In keeping with the performance tips in the Julia manual, I tried accessing the array slices with views instead of copying the slices. (The key lines are the `contrib = ...` assignments; the version with plain slices is commented out.) However, I find that the function takes about 3x longer using array views than when I use the basic slices.

As an MWE,

exdat = rand(512, 512, 512) .< .95
@time autocorr_eucl(exdat, 5)
@time autocorr_eucl(exdat, 5)


yields
 2.221220 seconds (1.16 k allocations: 4.548 GiB, 40.11% gc time)
when I use simple slices, whereas when I use view I find
8.895422 seconds (683 allocations: 1.517 GiB, 5.77% gc time)

What’s going on? I can see that this improves the allocations a lot, but the function gets much slower. (And if there are other things I can do to improve this code, I’m open to suggestions.)
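For anyone who wants to poke at this in isolation, here is a stripped-down comparison of the two `contrib` variants on a single shift (the size is reduced so it runs quickly; this is just a sketch of the comparison, not an explanation of the timings):

```julia
using Random

b = bitrand(256, 256, 256)

# Slices copy each region into a fresh, chunk-aligned BitArray first:
slice_sum(b, s) = sum(b[1:end-s, :, :] .* b[1+s:end, :, :])

# Views avoid the copy, but broadcasting over a BitArray view likely
# cannot work on whole 64-bit chunks in the same way:
view_sum(b, s) = sum(view(b, 1:size(b, 1)-s, :, :) .* view(b, 1+s:size(b, 1), :, :))

slice_sum(b, 1) == view_sum(b, 1)   # same answer either way
@time slice_sum(b, 1)
@time view_sum(b, 1)
```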

I’m not 100% certain I have this right, but it looks like you’re trying to compute a distance histogram: given some collection of elements with coordinates, computing their pairwise Euclidean distances and binning them?

If so, it’s generally preferable to compute the distances on the fly and increment counts directly in an integer array of observations. Please let me know if I have it right, and I may be able to help further - I think we have some overlap in domain.
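To clarify what I mean, here is a rough sketch of the counts-array approach (all names here are hypothetical):

```julia
# Bin pairwise Euclidean distances straight into an integer count array,
# without storing the displacements themselves.
function distance_histogram(points, binwidth, nbins)
    counts = zeros(Int, nbins)
    for i in eachindex(points), j in i+1:length(points)
        d = hypot((points[i] .- points[j])...)
        bin = floor(Int, d / binwidth) + 1
        bin <= nbins && (counts[bin] += 1)
    end
    return counts
end

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
distance_histogram(pts, 0.5, 4)   # all three pairwise distances land in bin 3
```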

If I’ve understood you correctly, you’re looking at the first part of my code, which shouldn’t be the biggest performance bottleneck, I think. I am finding all the possible index shifts within a certain radius and binning them by distance, but that is just to facilitate a further calculation.

For a 2D case that I can actually visualize, here is what I am trying to do:

Given a BitArray like the following:

[0 0 1
1 1 0
1 1 1]


the two-point correlation is something like a measure of how well a pixel’s own value predicts the values of its neighbors; the way I have defined it, calculating the two-point correlation at distance 1 from the center value would be f(1)_{2,2}= \frac{1}{4}(a_{2,2}a_{2,1} + a_{2,2}a_{1,2} + a_{2,2}a_{2,3} + a_{2,2}a_{3,2}) = \frac{1}{4}(1*0 + 1*1 + 1*1 +1*0) = \frac{1}{2}
In order to do this programmatically for every point at the same time and generate statistics, for the above array, I am writing
f(1) = \frac{1}{12}(a_{1,1}a_{1,2} + a_{1,2}a_{1,3} + a_{2,1}a_{2,2} + a_{2,2}a_{2,3} + a_{3,1}a_{3,2} + a_{3,2}a_{3,3} + a_{1,1}a_{2,1} + a_{1,2}a_{2,2} \\+ a_{1,3}a_{2,3}+ a_{2,1}a_{3,1} + a_{2,2}a_{3,2} + a_{2,3}a_{3,3} )
f(1) = \frac{1}{12}(0 + 0 + 1 + 0 + 1 + 1 + 0 + 0 + 0 + 1 + 1 + 0) = \frac{5}{12}
and representing that in code as

corr = (sum(A[1:2, :] .* A[2:3, :]) + sum(A[:,1:2] .* A[:,2:3]))/12
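Running that end-to-end on the example array confirms the hand calculation:

```julia
A = [0 0 1;
     1 1 0;
     1 1 1]

# Vertical pairs contribute 2, horizontal pairs contribute 3, over 12 pairs total.
corr = (sum(A[1:2, :] .* A[2:3, :]) + sum(A[:, 1:2] .* A[:, 2:3])) / 12
corr == 5/12   # matches f(1) above
```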


Once I start looping over all possible index shifts and tying those to particular distances (since I want to calculate this measure separately for each distance at all possible distances), it winds up looking like this.

A = [0 0 1;
     1 1 0;
     1 1 1]
inds = Dict(1.0 => [(1, 0), (0, 1)])
correlations = Dict(1.0 => 0.0)
nx, ny = 3, 3
for ind in inds[1.0]
    xshift, yshift = ind
    contrib = sum(A[1:end-xshift, 1:end-yshift] .* A[1+xshift:end, 1+yshift:end])
    correlations[1.0] += contrib / (nx - xshift) / (ny - yshift) / length(inds[1.0])
end


I anticipated that allocating the array slices would be a bottleneck of the code, since I allocate two slices for every possible index shift (and there are a lot of possible shifts once I calculate out to 10 or 15 pixels of distance).
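One thing that might help regardless of which indexing style wins: the `.*` broadcast itself materialises a full temporary array before `sum` reduces it. Fusing the elementwise product with the reduction via `mapreduce` avoids that temporary entirely. A sketch, using a small array and one hard-coded shift (names here are mine, not from the code above):

```julia
using Random

b = bitrand(64, 64, 64)
ox, oy, oz = size(b)
nx, ny, nz = 1, 0, 0   # one example shift

# Allocates the full product array, then sums it:
contrib_broadcast = sum(view(b, 1:ox-nx, 1:oy-ny, 1:oz-nz) .*
                        view(b, 1+nx:ox, 1+ny:oy, 1+nz:oz))

# Fuses the elementwise AND with the sum; no temporary is materialised:
contrib_fused = mapreduce(&, +, view(b, 1:ox-nx, 1:oy-ny, 1:oz-nz),
                                view(b, 1+nx:ox, 1+ny:oy, 1+nz:oz))

contrib_broadcast == contrib_fused   # identical counts
```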