Thread-local `Dict` for each thread?

dilumaluthge · December 8, 2020, 4:59am

How can I change my code to go from using a single global Dict to having a different thread-local Dict for each thread?

For example, suppose I have a package that looks like this:

module MyModule

const MYDICT = Dict{Symbol, Float64}()
my_get(x::Symbol) = Base.getindex(MYDICT, x)
my_set(x::Symbol, y::Float64) = Base.setindex!(MYDICT, y, x)  # setindex!(collection, value, key)

end

This won’t be thread-safe if I have multiple threads. Is there a way for me to have a separate Dict for each thread?

pbayer · December 8, 2020, 8:00am

You might consider an actor approach to make your Dict thread-safe. Consider the following toy example:

julia> using Guards, .Threads

julia> inc(x) = x[1]+=1
inc (generic function with 1 method)

julia> myDict = guard(Dict{Int, Float64}())
Guard{Dict{Int64,Float64}}(Link{Channel{Any}}(Channel{Any}(sz_max:32,sz_curr:0), 1, :guard))

julia> count = guard([0])
Guard{Array{Int64,1}}(Link{Channel{Any}}(Channel{Any}(sz_max:32,sz_curr:0), 1, :guard))

julia> @threads for _ in 1:1000
            myDict[@grd inc(count)] = rand()
       end

julia> call(count)
1-element Array{Int64,1}:
 1000

julia> length(@grd keys(myDict))
1000

julia> myDict[513]
0.37190425482220446

Guards is part of JuliaActors and will be available from the Julia registry later today.

Alternatively, you could have thread-specific Dicts with Actors or Guards by simply starting them with the keyword thrd=x. But then you have to figure out a mechanism to pass the thread-specific Dict to your tasks. Also, your data won’t be consistent between threads: the data a task sees would depend on the thread it is running on.

Isn’t a Dict a data structure to keep data consistent in space and time? Why let threads be a classification criterion for data?

pixel27 · December 8, 2020, 3:08pm

There are the task_local methods:

http://mortenpi.eu/julia/pretty-urls/stdlib/parallel/#Base.task_local_storage-Tuple{Any}
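
A minimal sketch of the task-local route, assuming one Dict per task is acceptable. `TLS_KEY` and `task_dict` are hypothetical names made up for illustration; only `task_local_storage` comes from Base:

```julia
# Hypothetical key under which each task stores its own Dict.
const TLS_KEY = :my_module_dict

# Return the Dict owned by the current task, creating it on first use.
function task_dict()
    d = get!(task_local_storage(), TLS_KEY) do
        Dict{Symbol, Float64}()
    end
    return d::Dict{Symbol, Float64}
end

my_get(x::Symbol) = task_dict()[x]
my_set(x::Symbol, y::Float64) = (task_dict()[x] = y)

# The current task and a spawned task each write to their own Dict.
my_set(:a, 1.0)
t = Threads.@spawn begin
    my_set(:a, 2.0)
    my_get(:a)
end
fetch(t)     # value stored by the spawned task
my_get(:a)   # value stored by the current task
```

Since the storage belongs to the task, no lock is needed here, but nothing is shared between tasks either.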

My guess is that they are thread-safe, but each task will have its own instance, NOT each thread. If you want one instance per thread, then something like:

using Base.Threads

struct ThreadDict
    protect::ReentrantLock
    dicts::Dict{Int, Dict{Symbol, Float64}}
    ThreadDict() = new(ReentrantLock(), Dict{Int, Dict{Symbol, Float64}}())
end

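# Look up `key` in the Dict that belongs to the calling thread.
function Base.getindex(t::ThreadDict, key)
    lock(t.protect) do
        if !haskey(t.dicts, threadid())
            t.dicts[threadid()] = Dict{Symbol, Float64}()
        end
        return t.dicts[threadid()][key]
    end
end

# Store `value` under `key` in the Dict that belongs to the calling thread.
# Note the argument order of setindex!: collection, value, key.
function Base.setindex!(t::ThreadDict, value, key)
    lock(t.protect) do
        if !haskey(t.dicts, threadid())
            t.dicts[threadid()] = Dict{Symbol, Float64}()
        end
        t.dicts[threadid()][key] = value
    end
end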

Performance for get/set should be fine. Changing the ReentrantLock to a SpinLock might improve it further. If you ever need to iterate over the values in the Dict, you will have to hold the lock for the entire iteration, which lengthens the critical section and can increase thread contention.
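
One possible way to keep the lock short when iterating, sketched under assumptions: copy the calling thread's Dict while holding the lock, then iterate over the copy. This is a different technique from holding the lock across the loop. `LockedThreadDict` and `snapshot` are hypothetical names, and the struct is a reduced stand-in for the ThreadDict above:

```julia
using Base.Threads

# Hypothetical stand-in for ThreadDict, reduced to what this sketch needs.
struct LockedThreadDict
    protect::ReentrantLock
    dicts::Dict{Int, Dict{Symbol, Float64}}
    LockedThreadDict() = new(ReentrantLock(), Dict{Int, Dict{Symbol, Float64}}())
end

# Store `value` under `key` in the calling thread's Dict.
function Base.setindex!(t::LockedThreadDict, value, key)
    lock(t.protect) do
        inner = get!(() -> Dict{Symbol, Float64}(), t.dicts, threadid())
        inner[key] = value
    end
end

# Copy the calling thread's Dict while holding the lock, so the caller
# can iterate over the copy after the lock has been released.
function snapshot(t::LockedThreadDict)
    lock(t.protect) do
        copy(get(t.dicts, threadid(), Dict{Symbol, Float64}()))
    end
end

td = LockedThreadDict()
td[:a] = 1.0
td[:b] = 2.0

for (k, v) in snapshot(td)   # the lock is not held during this loop
    println(k, " => ", v)
end
```

The trade-off is an extra allocation per iteration, and the copy will not reflect writes made after the snapshot was taken.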

kristoffer.carlsson · December 8, 2020, 3:38pm

You create one dict for each thread, put them in a vector, and when you want one you index with the thread id. Same as:

github.com/JuliaLang/julia/blob/af0006e1edd25b67891bb86330e294e41a036b3d/stdlib/Random/src/RNGs.jl#L369-L381

const THREAD_RNGs = MersenneTwister[]
@inline default_rng() = default_rng(Threads.threadid())
@noinline function default_rng(tid::Int)
    0 < tid <= length(THREAD_RNGs) || _rng_length_assert()
    if @inbounds isassigned(THREAD_RNGs, tid)
        @inbounds MT = THREAD_RNGs[tid]
    else
        MT = MersenneTwister()
        @inbounds THREAD_RNGs[tid] = MT
    end
    return MT
end
@noinline _rng_length_assert() =  @assert false "0 < tid <= length(THREAD_RNGs)"
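
Applied to the MyModule from the question, the same pattern might look like the sketch below. `THREAD_DICTS` and `thread_dict` are hypothetical names. The sketch assumes that thread ids run from 1 to `Threads.nthreads()` and that a task stays on the thread it started on while it uses its Dict:

```julia
module MyModule

# One Dict per thread, filled in when the module is loaded.
const THREAD_DICTS = Dict{Symbol, Float64}[]

function __init__()
    # The thread count is only known at runtime, so populate the vector here.
    empty!(THREAD_DICTS)
    append!(THREAD_DICTS, [Dict{Symbol, Float64}() for _ in 1:Threads.nthreads()])
end

# Dict belonging to the thread that runs the caller.
thread_dict() = THREAD_DICTS[Threads.threadid()]

my_get(x::Symbol) = thread_dict()[x]
my_set(x::Symbol, y::Float64) = (thread_dict()[x] = y)

end

MyModule.my_set(:a, 1.0)
MyModule.my_get(:a)
```

No lock is needed as long as each thread only touches its own slot of the vector.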
