Memory allocation when using an array of views

Vasily_Pisarev · April 29, 2019, 8:36pm

Hello all.

I’ve started playing around with Julia and am trying to write a molecular dynamics program.
That involves dealing with a lot of short arrays. I decided to use the following structure for coordinates:

struct Views3D{T}
    vec::Vector{<:SubArray{T,1,Vector{T}}}
    data::Vector{T}
end

Here, data is a length-3N vector which holds the related vectors for N atoms, and vec[i] is data[3(i-1)+1:3i], i.e. the vector for ith particle.

The atom data are arranged as follows:

struct InnerAtoms
    num::Base.RefValue{Int} # number of particles
    r::Views3D{Float64}
    v::Views3D{Float64}
    f::Views3D{Float64}
end

For some reason, that leads to memory allocation whenever the atom coordinates are processed. E.g., the following code puts coordinates inside the box if a particle got out:

@inbounds for iatom in 1:natoms
    rvec = box.atoms.r.vec[iatom]
    for j = 1:3
        if rvec[j] < boxlo[j]
            rvec[j] += boxdim[j]
        elseif rvec[j] >= boxhi[j]
            rvec[j] -= boxdim[j]
        end
    end
end

That code, however, allocates a lot. With @code_warntype, I get the following transcript:

│     %144 = (Base.getfield)(box, :atoms)::Main.JlMol.Atoms.InnerAtoms
│     %145 = (Base.getfield)(%144, :r)::Main.JlMol.Atoms.Views3D{Float64}
│     %146 = (Base.getfield)(%145, :vec)::Array{#s36,1} where #s36<:(SubArray{Float64,1,Array{Float64,1},I,L} where L where I)
│     %147 = (Base.getindex)(%146, %142)::SubArray{Float64,1,Array{Float64,1},I,L} where L where I
└────        goto #50 if not true
41 ┄─ %149 = φ (#40 => 1, #49 => %173)::Int64
│     %150 = φ (#40 => 1, #49 => %174)::Int64
│     %151 = (Base.getindex)(%147, %149)::Any

So, the compiler knows that rvec is a subarray of Float64 type but cannot infer that rvec[i] is a Float64?

Is it possible to change the structure definition to avoid those memory allocations?

mbauman · April 29, 2019, 8:49pm

Welcome! You’re seeing allocations because accesses into the .vec field of the Views3D structure are inherently unstable. All Julia knows is that it’s a vector containing some sort of view; Vector{<:SubArray{T,1,Vector{T}}} isn’t what we call a concrete type because it doesn’t fully describe all the type parameters of the SubArray. For more details, check out this section in the performance tips: Performance Tips · The Julia Language

You can fix this with an additional type parameter:

struct Views3D{T, S<:SubArray{T,1,Vector{T}}}
    vec::Vector{S}
    data::Vector{T}
end

That said, even once you fix that you might find working with a vector of small views to be unsatisfactory — making copies or copying into a pre-allocated matrix instead of using a vector of vectors can often be more performant as it improves the locality of the data and your computer can more effectively pre-fetch and cache the data.

foobar_lv2 · April 29, 2019, 9:09pm

I am assuming that you use this structure because some of your algorithms want to work by linearly indexing the data, and some others want to work by accessing one vec at a time. This works well with Matrix, unless you need to resize the data, i.e. add more atoms at runtime. If you don’t need to add mode atoms at runtime, consider storing multiple arrays that reference the same data, via reshape.

Furthermore, I am assuming that you need to write to vec[i]; otherwise, adjust the below to return SVector. With all these caveats, I’d recommend something like

struct Views3D{T}
    data::Vector{T}
end
Base.@propagate_inbounds function Base.getindex(v::Views3D, i)
@boundscheck checkbounds(v.data, 3(i-1)+1:3i)
return view(v.data, 3(i-1)+1:3i)
end

The reason is that julia has become pretty good at inlining and eliding allocations of views. In most cases, all these views will never exist.

Vasily_Pisarev · April 29, 2019, 9:33pm

Okay, I’ve changed the Views3D structure to

struct Views3D{T}
    vec::Vector{SubArray{T,1,Vector{T},Tuple{UnitRange{Int64}},true}}
    data::Vector{T}
end

and allocations are gone. But you are right about it still being very slow.

For using a matrix… The core of the MD simulation is to compute forces between pairs of particles, given two 3D position vectors. So, I need a way to view a column as a vector anyway.

Vasily_Pisarev · April 29, 2019, 9:48pm

I am assuming that you use this structure because some of your algorithms want to work by linearly indexing the data, and some others want to work by accessing one vec at a time.

Well, actually, the main reason is to use the same structure for “proper” atoms and “ghost” atoms (copies of the inner ones that are needed to calculate forces in periodic boundary conditions). And the list of ghost atoms needs to be resized (well, it can be statically allocated, but who knows how big is big enough for the purpose).

I did try using views as:

dr = zeros(Float64, 3)
force = zeros(Float64, 3)
@inbounds for i1 = 1:natoms
    r1 = view(atoms.r.data, 3*(i1-1)+1:3*i1)
    f1 = view(atoms.f.data, 3*(i1-1)+1:3*i1)
    for i2 = i1+1:natoms
        r2 = view(atoms.r.data, 3*(i2-1)+1:3*i2)
        f2 = view(atoms.f.data, 3*(i2-1)+1:3*i2)
        @. dr = r2 - r1
        force .= getforce(dr)
        f2 .+= force
        f1 .-= force
    end
end

It allocates memory on every view creation.

ChrisRackauckas · April 29, 2019, 9:55pm

Use StaticArrays. You may want to check out NBodySimulator.jl

Topic		Replies	Views
Is there a way to guarantee 0 allocations when accessing an array? General Usage	18	4306	June 1, 2017
Array views becoming dominant source of memory allocation Performance	31	5170	November 14, 2018
Switching out the underlying array in a SubArray Performance	19	941	May 30, 2019
Problem with creating view of vectors in vector of vectors (it keeps allocating) Performance memory-allocation , vector	2	129	July 9, 2024
Memory allocation while using @views Performance performance , arrays	5	1333	August 16, 2021

Memory allocation when using an array of views

Related topics