Dict with integer keys and fastest insertion?

What are the best options for a dict/map with integer keys and the fastest insert/update? For background: I want to aggregate position-weight pairs (p,w) that share the same position. I'm tinkering with the idea of implementing a sparse matrix as a list of dicts, one per column, and the matrix product needs this aggregation. So far, my benchmarks are:

function f0(pp::Vector{tp}, ww::Vector{tw}, Z::Vector{tw})  where {tp,tw}
    # aggregating with a dense vector Z
    for (p,w) ∈ zip(pp,ww)  Z[p] += w end;   
    _pp = findall(Z .≠ 0)
    _ww = Z[_pp]
    Z .= 0  # clean up
    return _pp, _ww end;

function _add!(D::Dict{tp,tw}, p::tp, w::tw)  where {tp, tw}
    # add w to D[p]; Base.ht_keyindex (internal API) avoids a second hash lookup
    idx = Base.ht_keyindex(D,p)
    if idx<0  D[p] = w
    else D.vals[idx] += w end end;

function f1(pp::Vector{tp}, ww::Vector{tw}, D::Dict{tp,tw})   where {tp,tw}
    # aggregating with a dict
    for (p,w) ∈ zip(pp,ww)  
        @inline _add!(D,p,w) end 
    filter!(pw -> !iszero(pw[2]), D) end;  # clean up

n, N = 10^5, 10^7;
pp, ww = UInt32.(rand(1:n,N)), rand(-1:0.001:1,N);
Z, D = fill(0.0, n), Dict{UInt32,Float64}();
@time f0(pp, ww, Z);
@time f1(pp, ww, D);
  0.034545 seconds (11 allocations: 1.538 MiB)
  0.212361 seconds (39 allocations: 4.334 MiB)

findall(!iszero, Z) might be faster
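For instance (a quick sketch): the predicate form scans Z directly, whereas Z .≠ 0 first allocates an intermediate BitVector that findall then scans:

```julia
Z = [0.0, 1.5, 0.0, -2.0]

idx1 = findall(Z .≠ 0)        # materializes a temporary BitVector first
idx2 = findall(!iszero, Z)    # scans Z directly, no intermediate array

idx1 == idx2                  # true; both return [2, 4]
```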


Is there a way to also specify the type of elements in output of findall?
Something like findall(UInt32, !iszero, Z)?
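As far as I know there is no such method; for an ordinary Vector, findall returns Vector{Int}. One workaround (a sketch; nonzero_indices is just an illustrative name, not a Base function) is to collect the indices yourself with the desired element type:

```julia
# collect the indices of nonzero entries of Z, converted to the requested type
function nonzero_indices(::Type{T}, Z::AbstractVector) where {T<:Integer}
    out = T[]
    for i in eachindex(Z)
        iszero(Z[i]) || push!(out, T(i))
    end
    return out
end

nonzero_indices(UInt32, [0.0, 1.5, 0.0, -2.0])  # UInt32[0x00000002, 0x00000004]
```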

Probably even better to overwrite pp and ww in-place if you can (since you are already overwriting Z), and merge the loops, e.g.

function f0!(pp::AbstractVector, ww::AbstractVector, Z::AbstractVector)
    for (p,w) ∈ zip(pp,ww); Z[p] += w; end
    ip, iw = firstindex(pp)-1, firstindex(ww)-1
    for p in eachindex(Z)
        @inbounds if Z[p] ≠ 0
            pp[ip += 1] = p
            ww[iw += 1] = Z[p]
        end
    end
    return resize!(pp, ip - firstindex(pp)+1),
           resize!(ww, iw - firstindex(ww)+1)
end

(Note that argument-type declarations like pp::Vector{tp} make no difference whatsoever to the performance; they just make your function less generic.)


No, in the case of matrix multiplication, I cannot overwrite pp and ww. But thank you! Any idea about fast Dict’s that I mentioned?

These two will have very different memory footprints: the former is proportional to the domain size n, while the latter is proportional to the size of the support of the pp values. This may determine which method to use. If the memory footprint of f0 is acceptable, it should be hard to beat in terms of performance.
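To make the footprint difference concrete (a rough sketch; Base.summarysize figures for the Dict depend on its current capacity and load factor):

```julia
n = 10^5
Z = fill(0.0, n)              # dense buffer: ~ n * sizeof(Float64), regardless of support
D = Dict{UInt32,Float64}()    # dict: grows with the number of distinct positions seen
for p in UInt32.(1:100)       # suppose only 100 distinct positions actually occur
    D[p] = 1.0
end

Base.summarysize(Z)  # ≈ 800_000 bytes
Base.summarysize(D)  # a few kB
```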

The clean-up filter!(pw -> !iszero(pw[2]), D) can be written as: filter!((!iszero)∘last, D)

And _add! can be elided with the use of mergewith!:

using OrderedCollections
f2(pp::Vector{tp}, ww::Vector{tw}, D::Dict{tp,tw})   where {tp,tw} =
  filter!((!iszero)∘last, mergewith!(+, D, LittleDict(pp,ww)))

Finally, it's better to benchmark with @btime (from BenchmarkTools.jl) instead of @time, to exclude compilation time and average out timing noise.


This last suggestion is much slower:

@time f0(pp,ww,Z);  @time f1(pp,ww,D1); @time f2(pp,ww,D2);
  0.054900 seconds (11 allocations: 1.538 MiB)
  0.243859 seconds (39 allocations: 4.334 MiB)
  0.871619 seconds (6 allocations: 208.000 MiB, 4.03% gc time)

I also tried Base.IdDict and Dictionaries.UnorderedDictionary, but both were slower than f1.

@time should be run twice to avoid including compilation times.

Or better yet, use BenchmarkTools.jl or ChairMarks.jl to perform more careful repeated timing.

(However, you have to be careful to include a setup phase to re-initialize the arguments between benchmarks, since these functions modify their third argument.)
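For example (a sketch, reusing the f1 and data defined above; Db is just an illustrative name): the setup keyword rebuilds the mutated argument before each sample, and evals=1 ensures each rebuilt argument is used for exactly one call:

```julia
using BenchmarkTools

# re-create the Dict before every sample so f1 never sees leftover state
@btime f1($pp, $ww, Db) setup=(Db = Dict{UInt32,Float64}()) evals=1
```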