StructArrays creating a lot of allocations compared to arrays

I am converting a molecular dynamics code from passing arrays, to instead holding information in StructArray’s. Unfortunately, the code is now 20 times slower.

Below I have two identical MWE, one doing the force calculations using arrays, the other, being passed a StructArray.

I compare the two methods using ‘@btime

The StructArray outputs 192.524 ms (4973027 allocations: 129.28 MiB)
The Array version outputs 32.745 ms (2 allocations: 23.52 KiB)

This is a big difference, but, the difference is even bigger in the actual code. It is about 20x slower.

Any ideas how I can get rid of the allocations when using StructArrays?

using ForwardDiff
using StaticArrays
using StructArrays
using BenchmarkTools


lj_grad(x, y, c) = -ForwardDiff.gradient(x -> lj_atom_pair_energy(x, y, c), x)

"Lennard Jones energy - we take its derivative to get force"
@inline function lj_atom_pair_energy(r1::SVector{3}, r2::SVector{3},
                            cutoff::Real
                            )
    dx = abs(r1[1] - r2[1]) # made up mirror image for MWE
    dy = abs(r1[2] - r2[2])
    dz = abs(r1[3] - r2[3])
    rij_sq = dx * dx + dy * dy + dz * dz
    if  rij_sq > cutoff^2 
        return 0.0 * rij_sq
    end
    sr2 = 1 / rij_sq
    e = 4  * (sr2^6 - sr2^3) 
    return e
end

mutable struct SimulationArrays 
    atom_arrays::StructArray
    other_stuff::Float64
end

"Make an array of structs, then convert to StructArray"
mutable struct ParticleType
    r::SVector{3, Float64}
    ID::Int64
    other_stuff::Float64
end

"Force calculation using StructArrays"
@inline function total_lj_force(
    simulation_arrays::SimulationArrays,
    cutoff::T) where T

    n::Int64 = length(simulation_arrays.atom_arrays.r[:])
    forces = [SVector{3}(0.0, 0.0, 0.0) for i=1:n ]

    for i = 1:(n-1)
        for j = (i+1):n
            dE_dr = lj_grad(
                simulation_arrays.atom_arrays.r[i],
                simulation_arrays.atom_arrays.r[j],
                cutoff
                )
            forces[i] = forces[i] + dE_dr
            forces[j] = forces[j] - dE_dr
        end
    end
    return forces
end

"Force calculation using standard arrays"
@inline function total_lj_force(
    r::Vector,
    cutoff::T) where T

    n::Int64 = length(r)
    forces = [SVector{3}(0.0, 0.0, 0.0) for i=1:n ]

    for i = 1:(n-1)
        for j = (i+1):n
            dE_dr = lj_grad(r[i], r[j], cutoff)
            forces[i] = forces[i] + dE_dr
            forces[j] = forces[j] - dE_dr
        end
    end
    return forces
end

function analytical_total_force(simulation_arrays::SimulationArrays, cutoff)
    """total energy of the system, StructArrays version"""
    return total_lj_force(simulation_arrays, cutoff)
end
function analytical_total_force(r::Vector, cutoff)
    """total energy of the system, simple arrays version"""
    return total_lj_force(r, cutoff)
end
#####################################
#
#           START OF TEST
#
####################################
r = rand(SVector{3}, 1000) * 5.0 # a cubic box of size 5
cutoff = 5.0 / 2

array_of_structs = Vector{ParticleType}(undef, length(r))
for i=1:length(r) 
    array_of_structs[i] = ParticleType(r[i], 1, rand())
end

struct_array = StructArray(array_of_structs)
simulation_arrays = SimulationArrays(struct_array, rand())

@btime analytical_total_force($simulation_arrays, $cutoff)
@btime analytical_total_force($r, $cutoff)

@code_warntype analytical_total_force(simulation_arrays, cutoff)

StructArray is an abstract type.
So, you have to parameterize the SimulationArrays struct to make atom_arrays have a concrete type:

mutable struct SimulationArrays{S<:StructArray}
    atom_arrays::S
    other_stuff::Float64
end

Or you can wrap the double loop in total_lj_force(::SimulationArrays, ...) in a function operating on simulation_arrays.atom_arrays.r and forces.

3 Likes

Parameterizing the struct did the trick and is really simple. Much appreciated!

now the timings are

StructArray:    33.797 ms (4 allocations: 47.03 KiB)
Array:             33.116 ms (2 allocations: 23.52 KiB)
2 Likes