Faster updates in a priority queue

shawnfrost · January 14, 2021, 7:12pm

Hi. I’m currently experiencing a bottleneck in my simulations where a relatively large amount of time is spent on the setindex! function from DataStructures.jl’s implementation of a priority queue. I have a representation of a cell and a priority queue to hold reactions on the basis of when they will occur, and I update elements of this queue at the end of each step. I have included a very simple form of my simulation: the main function here is updatePriority. This function loops over the range of reactions, checks if a reaction is possible and then updates the priority of that reaction. Are there any implementations of a priority queue in Julia, or any data structure that can operate as one, that allows more efficient updates with O(1) lookup for the minimum? My priority queue will never change in size: once the reactions are queued up, they will only ever have their priorities changed so I’m willing to trade efficiency in insertions/deletions in exchange for faster updates. I’m also wondering where so much of the memory allocations in my benchmark is coming from: if I am updating 34 64 bit floating point values 100,000 times that comes to around 27.2MiB of memory but I allocate almost 6x that amount.

using DataStructures
using Distributions

struct HexCell
    # Representation of a cell.
    propensity::Array{Float64}
    internalTime::Array{Float64}
    fireTime::Array{Float64}
    pQueue::PriorityQueue{Int,Float64}
end

function updatePriority(cell::HexCell)
    # Either set the time step for each of the 34 reactions to some value,
    # or to Inf. The primary function to benchmark. 
    for i in 1:34
        cell.pQueue[i] = cell.propensity[i] > 0 ? (cell.fireTime[i]-cell.internalTime[i])/cell.propensity[i] : Inf
    end
end

function initialize()
    # Construct the cell and enqueue all of the reactions.
    cell = HexCell(rand(Uniform(-1,1), 34),
                rand(Uniform(0,1), 34),
                rand(Uniform(0,1), 34),
                PriorityQueue{Int,Float64}())
    for i in 1:34
        delT = cell.propensity[i] > 0 ? (cell.fireTime[i]-cell.internalTime[i])/cell.propensity[i] : Inf
        enqueue!(cell.pQueue, i, delT)
    end
    return cell
end

function simulate(cell::HexCell)
    for i in 1:100_000
        updatePriority(cell)
    end
end

Here are some benchmarks for the simulation.

julia> c = initialize()
julia> @benchmark simulate(c)
BenchmarkTools.Trial: 
  memory estimate:  172.42 MiB
  allocs estimate:  11300000
  --------------
  minimum time:     384.554 ms (1.11% GC)
  median time:      389.894 ms (1.59% GC)
  mean time:        390.395 ms (1.72% GC)
  maximum time:     398.028 ms (1.98% GC)
  --------------
  samples:          13
  evals/sample:     1

Thanks in advance!

greg_plowman · January 14, 2021, 9:18pm

If you update all the priorities together, then you might consider using a vector and sorted index.
Sort once only after all entries in vector are updated.

struct HexCell2
    # Representation of a cell.
    propensity::Array{Float64}
    internalTime::Array{Float64}
    fireTime::Array{Float64}
    
    # rTime[priority[i]] contains ith element in priority queue
    rTime::Vector{Float64}
    priority::Vector{Int}  # contains indices 1:34 sorted by values of rTime, init to 1:34
end

function updatePriority(cell::HexCell2)
    # Either set the time step for each of the 34 reactions to some value,
    # or to Inf. The primary function to benchmark.
    for i in 1:34
        cell.rTime[i] = cell.propensity[i] > 0 ? (cell.fireTime[i]-cell.internalTime[i])/cell.propensity[i] : Inf
    end
    sortperm!(cell.priority, cell.rTime, initialized=true)
end

isaacsas · January 14, 2021, 9:50pm

Have you tried using a mutable binary heap instead? If I remember right the DataStructures.jl priority queue is backed by a Dict, whereas the heap uses an array. At least last time I checked it the latter was much faster for updating-heavy workflows.

More generally though, is this a stochastic chemical kinetics simulation where you are using the next reaction method? We’ve found in DiffEqJump that the NRM is rarely the best method, with the rejection SSA (RSSA), RSSA with composition-rejection (RSSACR), or just the classical Gillespie direct method faster for different reaction network topologies. (Note, this isn’t necessarily true for systems where every species has a population of only 1 or 0 as one might have in an agent-based simulation, but has been true for the chemical reaction networks we’ve looked at.)

isaacsas · January 14, 2021, 9:53pm

With all that said, a cache-oblivious binary heap with updating, or non-binary heap with updating could likely be even faster if you are just doing updates. I’m not sure there are any released packages though with such a data structure implemented.

shawnfrost · January 14, 2021, 10:24pm

Yes, the actual program is a stochastic simulation using the next reaction method. However, it uses a modified form of NRM that incorporates delayed reactions, such as mRNA transcription. I thought NRM would be faster however if the other algorithms do work better in practice, I’ll try and implement their delayed versions and see how they fare up. I did briefly look at some packages in Julia to see if this algorithm was already built in, such as BioSimulator.jl and DifferentialEquations.jl, but I think those packages don’t have the ability to model reactions with delays.

Topic		Replies	Views
Priority queue choice Performance	24	3182	September 1, 2022
Fastest data structure for a priority queue Performance data_structures	23	3957	September 24, 2021
PriorityQueue and containers Data	3	2232	July 12, 2017
Fast(er) priority queues? Performance question	20	1939	September 1, 2022
I want to use a PriorityQueue equivalent with duplicate keys General Usage question , data_structures	4	319	December 14, 2023

Faster updates in a priority queue

Related topics