# Populating a 3D array with discretely sampled scalar field

Hi,
I’m trying to populate a 3D array (predetermined size of about a million) with values of a `Float64` function on a cartesian grid.
I think the term for this is “discretely sampled volumetric data” in scientific visualization.
Anyway, I wrote 3 different ways to do it in Julia and measured the performance with `@time` but all of them were less than half the speed of an equivalent C code (`gcc -O2`).
I fiddled with my Julia code just enough to learn that some allocations need to be pulled out of the loop, that I need to be careful how I select a range of elements while avoiding slowdowns, etc. I basically checked through all of the points in Docs » Performance Tips that apply to me, but the speed isn’t that satisfying.

Perhaps this is the limit of Julia as far as populating an array is concerned…
Could anyone let me know if there’s a faster way? Thanks.

``````function fill_grid_1!(gridR::Array{Float64,3})
rVec = zeros(Float64, 3)
s = 1
@inbounds for k in 0:zRes
for j in 0:yRes
for i in 0:xRes
rVec[1] = x0 + i*dx
rVec[2] = y0 + j*dx
rVec[3] = z0 + k*dx
gridR[s] = R_5_0(norm(rVec))
#gridR[i+1,j+1,k+1] = R_5_0(norm(rVec))
s += 1
end
end
end
nothing
end
``````

Turning off bounds checking can help a lot in tight loops. Also, give `@simd` a try, but Julia defaults to `-O3` which should add it automatically.

Not sure why the first option is allocating so much more though. I would check to make sure it’s all okay. `@inbounds` might fix the allocations.

3 Likes

The memory allocation in `fill_grid_1!`, which really shouldn’t allocate in the loop, cries type instability and `code_warntype` confirms it. The problem is that the globals aren’t as const as intended. Try

``````const x0 = const y0 = const z0 = -10.0
const xRes = const yRes = const zRes = 100
``````
4 Likes

Other things worth trying, applicable to the C code as well.

You don’t really need `rVec`. Compute the norm yourself. (Using the `StaticArrays` package could be an alternative but doubtful it’s a win here.)

``````        x = x0 + i*dx
y = y0 + j*dx
z = z0 + k*dx
gridR[s] = R_5_0(sqrt(x * x + y * y + z * z))
``````

Evaluate the polynomial in `R_5_0` with Horner’s method. The macro `@evalpoly` may be helpful.

The division by `300 * √5` could be multiplication by a precomputed constant.

1 Like

I did everything said here so far and the Julia code is now more than 4 times faster than the C code.
(fixing `const` declarations gave the biggest speedup, so I selected that reply as the correct answer)

``````const x0 = const y0 = const z0 = -10.0
const xRes = const yRes = const zRes = 400

# ...

function R_5_0(r::Float64)
ρ5 = 0.4*r
return exp(-0.5*ρ5) * @evalpoly(ρ5, 120, -240, 120, -20.0, 1.0) * 0.0014907119849998597
end

function fill_grid_2!(gridR::Array{Float64,3})
s = 0
@inbounds for k in 0:zRes
for j in 0:yRes
for i in 0:xRes
x = x0 + i*dx
y = y0 + j*dx
z = z0 + k*dx
gridR[s += 1] = R_5_0(√(x*x + y*y + z*z))
end
end
end
end
``````
3 Likes

Yes, I don’t know how I missed those globals. Here’s some literature on why this happens, and a few related things to look out for.

http://docs.julialang.org/en/stable/manual/performance-tips/#avoid-global-variables

1 Like