Sorting a StructArray variable faster and how to update values?

Hello!

I have a runnable MWE here, put into a small module to easily redefine struct as needed:

module ParticleSorting

export sort_particles_by_density

using StructArrays
using StaticArrays
using BenchmarkTools
using Random

# Define a simple particle struct
struct Particle
    position :: SVector{2,Float64}
    velocity :: SVector{2,Float64}
    density  :: Float64
    index    :: CartesianIndex{2}
    boundary_bool :: Bool
end

# Function to create a StructArray with random data for demonstration
function create_particles(n)
    positions     = rand(SVector{2,Float64}, n)
    velocities    = rand(SVector{2,Float64}, n) * 100
    densities     = rand(n) * 1000
    index         = [CartesianIndex(rand(1:100), rand(1:100)) for _ in 1:n]
    boundary_bool = rand(Bool, n)
    StructArray{Particle}((position = positions, velocity = velocities, density = densities, index = index, boundary_bool = boundary_bool))
end

# Function to sort the StructArray by the density field
function sort_particles_by_index(particles)
    sort!(particles, by = p -> p.index)
end

# Function to benchmark sorting
function benchmark_sorting(particles)
    # The setup block: shuffle the particles array; this part is not timed
    setup = (particles = shuffle!(particles);)

    @benchmark sort_particles_by_index($particles) setup=$setup
end

end # module ParticleSorting


using .ParticleSorting

# Create a StructArray with random particles
particles = ParticleSorting.create_particles(100000)

# Benchmark sorting by density
benchmark_result = ParticleSorting.benchmark_sorting(particles)

# Print benchmark result
display(benchmark_result)

My need is that I have a list of vectors which I need to sort based on the index field. I’ve found it is faster to put everything in a StructVector and call sort! on that. My benchmark result is something like this:

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  125.900 ΞΌs …   5.389 ms  β”Š GC (min … max): 0.00% … 0.00%
 Time  (median):     132.100 ΞΌs               β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   164.548 ΞΌs Β± 103.733 ΞΌs  β”Š GC (mean Β± Οƒ):  0.00% Β± 0.00%

  β–ˆβ–ˆβ–„β–ƒβ–ƒβ–‚β–‚β–‚β–β–β–β–β–β–β–β–β–β–β–β–β–β–‚β–‚β–β–                                     ▁
  β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–‡β–‡β–‡β–‡β–‡β–†β–‡β–†β–…β–…β–†β–…β–†β–„β–…β–†β–…β–…β–„β–…β–…β–…β–„β–„β–ƒβ–„β–…β–ƒβ–„β–„β–„ β–ˆ
  126 ΞΌs        Histogram: log(frequency) by time        480 ΞΌs <

 Memory estimate: 0 bytes, allocs estimate: 0.

Which is quite amazing, but I wondered if it can be done faster?

Secondly, after having sorted the values I want to update a value:

julia> particles[1].velocity = particles[2].velocity
ERROR: setfield!: immutable struct of type Particle cannot be changed
Stacktrace:
 [1] setproperty!(x::Main.ParticleSorting.Particle, f::Symbol, v::StaticArraysCore.SVector{2, Float64})
   @ Base .\Base.jl:41
 [2] top-level scope
   @ REPL[4]:1

I think the point of a StructVector is a bit lost, if I can’t update values at some point, so I am curios as to how to do this properly? I understand that the SVector it self is immutable, I don’t understand why I canΒ΄t replace it with a new one in the StructVector.

Kind regards

For setting values, you may use Accessors:

julia> particles = ParticleSorting.create_particles(100000);

julia> particles[1].velocity == particles[2].velocity
false

julia> using Accessors

julia> @reset particles[1].velocity = particles[2].velocity;

julia> particles[1].velocity == particles[2].velocity
true

julia> particles[1] == particles[2]
false

sort! is falling back to the default implementation, so it’s worth checking if applying the permutation array-wise is faster.

1 Like

This does not accomplish what you think it does. You do not shuffle the particles in between runs. The variable setup just contains the shuffled particles and when you interpolate that into the setup block of the @benchmark then it is essentially a no-op.

You should write this in one go:

function benchmark_sorting(particles)
    # The setup block: shuffle the particles array; this part is not timed

    @benchmark sort_particles_by_index(particles_shuffled) setup=(particles_shuffled = shuffle!(particles);) samples=1
end

Where also set samples to 1 to ensure that the particles are shuffled each time.

1 Like

In the past when I used sortperm! since I have to update 9 arrays in my actual use-case, sort! was faster. In regards to indexing I found out I was using it wrong, I could simply do:

particles.position[1] = particles.position[2]

Which seems to have non run-issues, so just user error by me

Thanks!

I’ve updated the benchmark to be correct as you suggest:

module ParticleSorting

export sort_particles_by_density

using StructArrays
using StaticArrays
using BenchmarkTools
using Random

# Define a simple particle struct
struct Particle
    position :: SVector{2,Float64}
    velocity :: SVector{2,Float64}
    density  :: Float64
    index    :: CartesianIndex{2}
    boundary_bool :: Bool
end

# Function to create a StructArray with random data for demonstration
function create_particles(n)
    positions     = rand(SVector{2,Float64}, n)
    velocities    = rand(SVector{2,Float64}, n) * 100
    densities     = rand(n) * 1000
    index         = [CartesianIndex(rand(1:100), rand(1:100)) for _ in 1:n]
    boundary_bool = rand(Bool, n)
    StructArray{Particle}((position = positions, velocity = velocities, density = densities, index = index, boundary_bool = boundary_bool))
end

function sort_by_index(particles)
    sort!(particles, by = p -> p.index)
end

# Function to benchmark sorting
function benchmark_sort_by_index(particles)
    # The setup block: shuffle the particles array; this part is not timed

    @benchmark  sort_by_index(particles_shuffled) setup=(particles_shuffled = shuffle!($particles);) samples=1
end



end # module ParticleSorting

using JET
using .ParticleSorting

N = 100_000

# Create a StructArray with random particles
particles = ParticleSorting.create_particles(N)

# Benchmark sorting by density
benchmark_result = ParticleSorting.benchmark_sort_by_index(particles); display(benchmark_result)

println(@report_opt target_modules=(@__MODULE__,) ParticleSorting.sort_by_index(particles))
println(@report_call target_modules=(@__MODULE__,) ParticleSorting.sort_by_index(particles))
@code_warntype ParticleSorting.sort_by_index(particles)

Now I get a timing of:

BenchmarkTools.Trial: 1 sample with 1 evaluation.
 Single result which took 14.346 ms (0.00% GC) to evaluate,
 with a memory estimate of 6.10 MiB, over 2 allocations.

Analyzing it with JET and @code_warntype:

No errors detected

No errors detected

MethodInstance for Main.ParticleSorting.sort_by_index(::StructVector{Main.ParticleSorting.Particle, @NamedTuple{position::Vector{SVector{2, Float64}}, velocity::Vector{SVector{2, Float64}}, density::Vector{Float64}, index::Vector{CartesianIndex{2}}, boundary_bool::Vector{Bool}}, Int64})
  from sort_by_index(particles) @ Main.ParticleSorting c:\git\SPHExample\test\faster_sorting.jl:29
Arguments
  #self#::Core.Const(Main.ParticleSorting.sort_by_index)
  particles::StructVector{Main.ParticleSorting.Particle, @NamedTuple{position::Vector{SVector{2, Float64}}, velocity::Vector{SVector{2, Float64}}, density::Vector{Float64}, index::Vector{CartesianIndex{2}}, boundary_bool::Vector{Bool}}, Int64}
Locals
  #3::Main.ParticleSorting.var"#3#4"
Body::StructVector{Main.ParticleSorting.Particle, @NamedTuple{position::Vector{SVector{2, Float64}}, velocity::Vector{SVector{2, Float64}}, density::Vector{Float64}, index::Vector{CartesianIndex{2}}, boundary_bool::Vector{Bool}}, Int64}
1 ─      (#3 = %new(Main.ParticleSorting.:(var"#3#4")))
β”‚   %2 = #3::Core.Const(Main.ParticleSorting.var"#3#4"())
β”‚   %3 = (:by,)::Core.Const((:by,))
β”‚   %4 = Core.apply_type(Core.NamedTuple, %3)::Core.Const(NamedTuple{(:by,)})
β”‚   %5 = Core.tuple(%2)::Core.Const((Main.ParticleSorting.var"#3#4"(),))
β”‚   %6 = (%4)(%5)::Core.Const((by = Main.ParticleSorting.var"#3#4"(),))
β”‚   %7 = Core.kwcall(%6, Main.ParticleSorting.sort!, particles)::StructVector{Main.ParticleSorting.Particle, @NamedTuple{position::Vector{SVector{2, Float64}}, velocity::Vector{SVector{2, Float64}}, density::Vector{Float64}, index::Vector{CartesianIndex{2}}, boundary_bool::Vector{Bool}}, Int64}
└──      return %7

Which shows that everything should be correctly inferred I think

Kind regards

1 Like

Ah sorry samples=1 was wrong. It should have been evals=1 I think. You see that right now you have no statistics in your benchmarking (1 sample with 1 eval) but all I wanted to do was having many samples with only 1 eval because IIRC BenchmarkTools.jl runs the setup in between samples but not in between evaluations.

Thanks, evals gives the same result but now in a nicer format:

BenchmarkTools.Trial: 262 samples with 1 evaluation.
 Range (min … max):  13.489 ms … 25.261 ms  β”Š GC (min … max): 0.00% … 20.56%
 Time  (median):     15.602 ms              β”Š GC (median):    0.00%
 Time  (mean Β± Οƒ):   16.381 ms Β±  2.480 ms  β”Š GC (mean Β± Οƒ):  3.61% Β±  7.48%

      β–†β–‡β–…β–ƒβ–…β–ˆ β–‚β–ƒβ–„β–„
  β–ƒβ–…β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‡β–ˆβ–ˆβ–ˆβ–ˆβ–…β–ˆβ–†β–…β–†β–‡β–…β–ƒβ–…β–ƒβ–β–β–β–ƒβ–ƒβ–„β–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–ƒβ–„β–ƒβ–„β–ƒβ–ƒβ–β–ƒβ–ƒβ–β–ƒβ–„β–ƒβ–ƒβ–β–ƒβ–ƒβ–β–β–ƒβ–ƒ β–ƒ
  13.5 ms         Histogram: frequency by time          24 ms <

 Memory estimate: 6.10 MiB, allocs estimate: 2.