More allocation in for loop than broadcast

I am new to Julia and writing a simulation code to track the particle motion.

Here is a simplified version of my code:

using BenchmarkTools
using StructArrays
using StaticArrays

struct ps3d # 6D phase space
    x::Float64
    y::Float64
    z::Float64
end

struct Particles
    npar::Int64 # Number of macro particles
    dist::StructArray{ps3d} # distribution
    temp1::Vector{Float64} # temporary variable for pass!
    temp2::Vector{Float64} # temporary variable for pass!
    function Particles(np::Int64)
        new(np, StructArray{ps3d}(undef,np), 
            Vector{Float64}(undef, np), Vector{Float64}(undef, np))
    end
end

test=Particles(1000)
test.dist.x


struct TransferMap
    scale::SVector{3,Float64}
    linearmap::SMatrix{3,3,Float64}
    TransferMap(s::AbstractVector, m::AbstractMatrix)=new(s,m)
end

function pass!(ps::AbstractVector{ps3d}, tm::TransferMap, tempx, tempy) 
    for i in eachindex(ps.x)
        tempx[i] = tm.linearmap[1,1]*ps.x[i] + tm.linearmap[1,2]*ps.y[i] + tm.linearmap[1,3]*ps.z[I]
        tempy[i] = tm.linearmap[2,1]*ps.x[i] + tm.linearmap[2,2]*ps.y[i] + tm.linearmap[2,3]*ps.z[i]
        ps.z[i] = tm.linearmap[3,1]*ps.x[i] + tm.linearmap[3,2]*ps.y[i] + tm.linearmap[3,3]*ps.z[i]
        ps.x[i] = tempx[i]
        ps.y[i] = tempy[i]
    end
end

function pass_bcast!(ps::AbstractVector{ps3d}, tm::TransferMap, tempx, tempy) 
    tempx .= tm.linearmap[1,1].*ps.x .+ tm.linearmap[1,2].*ps.y .+ tm.linearmap[1,3].*ps.z
    tempy .= tm.linearmap[2,1].*ps.x .+ tm.linearmap[2,2].*ps.y .+ tm.linearmap[2,3].*ps.z
    ps.z .= tm.linearmap[3,1].*ps.x .+ tm.linearmap[3,2].*ps.y .+ tm.linearmap[3,3].*ps.z
    ps.x .= tempx
    ps.y .= tempy
end

@benchmark begin
    test=Particles(1000000)
    test.dist.x .= rand.()
    test.dist.y .= rand.()
    test.dist.z .= rand.()

    tm=TransferMap(SVector(1.0,1.0,1.0), @SMatrix ([1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0]))

    pass!(test.dist, tm, test.temp1, test.temp2)
end

With the for loop version (function pass!) I got memory allocation of 175MB with 119 ms run time; while using the broadcast version (function pass_bcast!) , the memory allocation is 38 MB with 6 ms run time.

Please advise if I did anything wrong? Thank you.

I am on mobile so can’t test but I doubt this code will run. Inside pass! you write a couple of times I instead of i. If you imported LinearAlgebra.jl as well in your main code then maybe this would run (not sure) but do something different then pass_bcast!.

Also a heads up: This initializes all array entries to the same value:

I don’t know about your specific problem, but the above field has an abstract type. Since the compiler cannot figure out that 3*3 is 9 (don’t ask me why), you need to write

linearmap::SMatrix{3,3,Float64, 9}

That might help with some of it.

1 Like

Have fix the “I” with i, messed up by grammar check plugin.

Thanks.

Thanks a lot. This fixed the gap between two functions.

Have no idea why this error won’t affect the broadcast version.