Performance of Dictionaries.jl vs Base.Dict

rocco_sprmnt21 · December 13, 2024, 10:07am

I tried to use Dictionares instead of the basic Dict, but keeping the same structure, in this elegant script by @Daniel_Gonzalez that solves the problem of the AOC 2024 d11, but I got no improvements.

I don’t know if the failure is due to a clumsy application of the Dictionares.jl package or if the script needs to be changed to take advantage of the package’s features.
I hope someone else will succeed in doing so.

one SOL od d11 AOC 2024

const NO_STONE = -1 # folks, we're writing old school C today

stones = split(readlines("2024/inputs/day11.txt") |> only, " ") .|> x -> parse(Int, x)

number_of_digits(x) = floor(Int, log10(x)) + 1

function blink(stone::Integer)
    stone == 0 && return (1, NO_STONE)
    n = number_of_digits(stone)
    iseven(n) && return divrem(stone, 10^(n ÷ 2))
    (stone * 2024, NO_STONE)
end

function solve(stones, n)
    d = Dict{Int64,Int64}()
    for s ∈ stones
        d[s] = 1
    end
    for _ in 1:n
        d_n = Dict{Int64,Int64}()
        for (s, n) ∈ d
            for s2 ∈ blink(s)
                s2 == NO_STONE && continue
                if haskey(d_n, s2)
                    d_n[s2] += n
                else
                    d_n[s2] = n
                end
            end
        end
        d = d_n
    end
    sum(values(d))
end

solve(stones, 25)
solve(stones, 75)

For those who want to try, I leave the input to use in the tests

stones=[4610211, 
       4,
       0,
      59,
    3907,
  201586,
     929,
   33750]

greatpet · December 13, 2024, 12:14pm

Since the posters above seem to suggest that Dictionaries.jl is faster for iteration but possibly slower for random access, it’s normal to not expect a speedup for your code which involves random access of the d_n Dict.

Having said that, the following part your code could potentially be made faster if the code can be rewritten to take advantage of a token API for avoiding repeated hash computations:

Dan · December 13, 2024, 1:44pm

Another version of solve saving hash calculation using mergewith!:

using OrderedCollections

function solve(stones, n)
    d_new = Dict{Int,Int}()
    d_old = Dict(stones .=> 1)
    for _ in 1:n
        for (s, n) ∈ d_old
            t1, t2 = blink(s)
            mergewith!(+, d_new,
              t2 == NO_STONE ? LittleDict((t1,),(n,)) : 
                               LittleDict((t1,t2),(n,n)))
        end
        d_old, d_new = d_new, d_old
        empty!(d_new)
    end
    sum(values(d_old))
end

rocco_sprmnt21 · December 13, 2024, 3:17pm

If I did not measure badly, the performance does not improve. On the contrary …

julia> @btime solve(stones, 25)  # 260 us
  3.339 ms (13913 allocations: 1.56 MiB)
197357

julia> @btime solve(stones, 75)  # 12ms
  155.595 ms (516906 allocations: 49.95 MiB)
234568186890978

rocco_sprmnt21 · December 13, 2024, 3:35pm

I apologize for bringing up these old messages, but for me, and perhaps for others too, they are of current interest.
I tried to use the instructions of the update function!() but I get no improvements.
Maybe I apply it wrong

julia> function solve(stones, n)
           d = Dict{Int64,Int64}()
           for s ∈ stones
               d[s] = 1
           end
           for _ in 1:n
               d_n = Dict{Int64,Int64}()
               for (s, n) ∈ d
                   for s2 ∈ blink(s)
                       s2 == NO_STONE && continue
                       index = Base.ht_keyindex2!(d_n, s2)
           if index > 0
               @inbounds d_n.vals[index] += n # = func(dict.vals[index])
           else
               @inbounds Base._setindex!(d_n, n, s2, -index)
           end
                       # if haskey(d_n, s2)
                       #     d_n[s2] += n
                       # else
                       #     d_n[s2] = n
                       # end
                   end
               end
               d = d_n
           end
           sum(values(d))
       end
solve (generic function with 1 method)

julia> solve(stones, 25)
197357

julia> solve(stones, 75)
234568186890978

julia> @btime solve(stones, 25)  # 260 us
  635.100 μs (9800 allocations: 496.95 KiB)
197357

julia> @btime solve(stones, 75)  # 12ms
  29.340 ms (590514 allocations: 23.42 MiB)
234568186890978

rocco_sprmnt21 · December 13, 2024, 3:51pm

In the case of this script, using get!() does not improve the situation.

julia> function solve(stones, n)
           d = Dict{Int64,Int64}()
           for s ∈ stones
               d[s] = 1
           end
           for _ in 1:n
               d_n = Dict{Int64,Int64}()
               for (s, n) ∈ d
                   for s2 ∈ blink(s)
                       s2 == NO_STONE && continue
                       d_n[s2] = n + get!(d_n, s2, 0)
                       # index = Base.ht_keyindex2!(d_n, s2)
                       # if index > 0
                       #     @inbounds d_n.vals[index] += n # = func(dict.vals[index])
                       # else
                       #     @inbounds Base._setindex!(d_n, n, s2, -index)       
                       # end
                       # if haskey(d_n, s2)
                       #     d_n[s2] += n
                       # else
                       #     d_n[s2] = n
                       # end
                   end
               end
               d = d_n
           end
           sum(values(d))
       end
solve (generic function with 1 method)


julia> @btime solve(stones, 25)  # 260 us
  593.500 μs (8490 allocations: 476.48 KiB)
197357

julia> @btime solve(stones, 75)  # 12ms
  26.786 ms (345428 allocations: 19.68 MiB)
234568186890978

Dan · December 13, 2024, 3:56pm

Something is not measured properly - too many allocations. Here is the same run from my REPL:

julia> @btime solve($stones, 75)
  6.904 ms (84 allocations: 511.62 KiB)
234568186890978

julia> versioninfo()
Julia Version 1.11.1

rocco_sprmnt21 · December 13, 2024, 4:04pm

using OrderedCollections

function solve(stones, n)
    d_new = Dict{Int,Int}()
    d_old = Dict(stones .=> 1)
    for _ in 1:n
        for (s, n) ∈ d_old
            t1, t2 = blink(s)
            mergewith!(+, d_new,
              t2 == NO_STONE ? LittleDict((t1,),(n,)) : 
                               LittleDict((t1,t2),(n,n)))
        end
        d_old, d_new = d_new, d_old
        empty!(d_new)
    end
    sum(values(d_old))
end



julia> solve(stones, 25)
197357

julia> solve(stones, 75)
234568186890978

julia> @btime solve(stones, 25)  # 260 us
  3.321 ms (13913 allocations: 1.56 MiB)
197357

julia> @btime solve($stones, 25)  # 260 us
  3.332 ms (13912 allocations: 1.56 MiB)
197357

julia> @btime solve($stones, 75)  # 12ms
  152.654 ms (516905 allocations: 49.95 MiB)
234568186890978


julia> versioninfo()
Julia Version 1.10.6

Someone changed the code of dict.jl!?!?

Dan · December 13, 2024, 4:48pm

That is strange.

More likely some reference to a global instead of local. Have you checked blink?

Is NO_STONE const?

rocco_sprmnt21 · December 13, 2024, 4:56pm

from a fresh session

Summary

julia> const NO_STONE = -1 # folks, we're writing old school C today
-1

julia> stones = split(readlines("input_aoc24_d11.txt") |> only, " ") .|> x -> parse(Int, x)
8-element Vector{Int64}:
 4610211
       4
       0
      59
    3907
  201586
     929
   33750

julia> number_of_digits(x) = floor(Int, log10(x)) + 1
number_of_digits (generic function with 1 method)

julia> function blink(stone::Integer)
           stone == 0 && return (1, NO_STONE)
           n = number_of_digits(stone)
           iseven(n) && return divrem(stone, 10^(n ÷ 2))
           (stone * 2024, NO_STONE)
       end
blink (generic function with 1 method)

julia> using OrderedCollections

julia> function solve(stones, n)
           d_new = Dict{Int,Int}()
           d_old = Dict(stones .=> 1)
           for _ in 1:n
               for (s, n) ∈ d_old
                   t1, t2 = blink(s)
                   mergewith!(+, d_new,
                     t2 == NO_STONE ? LittleDict((t1,),(n,)) : 
                                      LittleDict((t1,t2),(n,n)))
               end
               d_old, d_new = d_new, d_old
               empty!(d_new)
           end
           sum(values(d_old))
       end
solve (generic function with 1 method)

julia> solve(stones, 25)
197357

julia> solve(stones, 75)
234568186890978

julia> @btime solve($stones, 25)  # 260 us
ERROR: LoadError: UndefVarError: `@btime` not defined
in expression starting at c:\Users\sprmn\.julia\environments\v1.10.5\AOC_24_d11.jl:168

julia> using BenchmarkTools

julia> @btime solve($stones, 25)  # 260 us
  602.400 μs (441 allocations: 1.04 MiB)
197357

julia> 

julia> @btime solve($stones, 75)  # 12ms
  25.083 ms (3614 allocations: 30.75 MiB)
234568186890978

Dan · December 13, 2024, 5:10pm

Already a big improvement.
My results:

julia> @btime solve($stones, 75)
  6.851 ms (84 allocations: 511.62 KiB)
234568186890978

seem to be due to 1.11 changes.

rocco_sprmnt21 · December 13, 2024, 5:34pm

But isn’t the original script improved by julia 1.11?

I mean this

                       # if haskey(d_n, s2)
                       #     d_n[s2] += n
                       # else
                       #     d_n[s2] = n
                       # end

greatpet · December 13, 2024, 6:31pm

Haven’t followed all the benchmarking, but dict.jl has changed in Julia 1.11 and now uses Memory instead of Vector internally for holding data.

Topic		Replies	Views
Poor time performance on Dict? Performance	26	19011	March 12, 2018
Performance of IdDict vs Dict Performance	13	1424	May 19, 2023
[ANN] Dictionaries.jl - Improved productivity and performance of dictionaries in Julia Package Announcements dictionary , dictionaries	22	6150	December 15, 2019
What is the fastest data type out of these? General Usage question	18	1197	October 1, 2018
Best performance using dictionaries, functions and modules Performance question	8	851	June 5, 2020

Performance of Dictionaries.jl vs Base.Dict

Related topics