Thread-local `Dict` for each thread?

How can I change my code to go from using a single global Dict to having a different thread-local Dict for each thread?

For example, suppose I have a package that looks like this:

module MyModule

const MYDICT = Dict{Symbol, Float64}()
my_get(x::Symbol) = Base.getindex(MYDICT, x)
my_set(x::Symbol, y::Float64) = Base.setindex!(MYDICT, y, x)  # setindex!(collection, value, key)

end

This won’t be thread-safe if I have multiple threads. Is there a way for me to have a separate Dict for each thread?

You could consider an actor approach to make your Dict thread-safe. Here is a toy example:

julia> using Guards, .Threads

julia> inc(x) = x[1]+=1
inc (generic function with 1 method)

julia> myDict = guard(Dict{Int, Float64}())
Guard{Dict{Int64,Float64}}(Link{Channel{Any}}(Channel{Any}(sz_max:32,sz_curr:0), 1, :guard))

julia> count = guard([0])
Guard{Array{Int64,1}}(Link{Channel{Any}}(Channel{Any}(sz_max:32,sz_curr:0), 1, :guard))

julia> @threads for _ in 1:1000
            myDict[@grd inc(count)] = rand()
       end

julia> call(count)
1-element Array{Int64,1}:
 1000

julia> length(@grd keys(myDict))
1000

julia> myDict[513]
0.37190425482220446

Guards is part of JuliaActors and will be available from the Julia registry later today.

Alternatively, you could have thread-specific Dicts with Actors or Guards by simply starting them with the keyword thrd=x. But then you need a mechanism to pass the thread-specific Dict to your tasks, and your data won’t be consistent between threads: the data a task sees would depend on the thread it runs on.

Isn’t a Dict a data structure to keep data consistent in space and time? Why let threads be a classification criterion for data?

There are the task_local methods:

http://mortenpi.eu/julia/pretty-urls/stdlib/parallel/#Base.task_local_storage-Tuple{Any}
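
As a minimal sketch of how task-local storage could back the original `my_get`/`my_set`: the key name `:my_module_dict` and the helper `task_dict` are made up for illustration, and the only Base call relied on is `task_local_storage()`, which returns the current task's storage dictionary.

```julia
# Hypothetical key under which each task stores its own Dict.
const TLS_KEY = :my_module_dict

# Return the Dict owned by the current task, creating it on first use.
task_dict() = get!(() -> Dict{Symbol, Float64}(),
                   task_local_storage(), TLS_KEY)::Dict{Symbol, Float64}

my_get(x::Symbol) = task_dict()[x]
my_set(x::Symbol, y::Float64) = (task_dict()[x] = y)

my_set(:a, 1.0)

# A new task starts with its own empty storage, so :a is not visible there.
seen_in_child = fetch(@async haskey(task_dict(), :a))
```

Because the storage belongs to the task, no lock is needed here, but values set in one task are never visible from another.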

My guess is that they are thread-safe, but each task, NOT each thread, will have its own instance. If you want it to be per thread instead, then something like:

using Base.Threads

# One inner Dict per thread id, all guarded by a single lock.
struct ThreadDict
    protect::ReentrantLock
    dicts::Dict{Int, Dict{Symbol, Float64}}
    ThreadDict() = new(ReentrantLock(), Dict{Int, Dict{Symbol, Float64}}())
end

function Base.getindex(t::ThreadDict, key)
    lock(t.protect) do
        # Create this thread's Dict on first use, then look up the key.
        d = get!(() -> Dict{Symbol, Float64}(), t.dicts, threadid())
        return d[key]
    end
end

# Note the argument order: setindex!(collection, value, key).
function Base.setindex!(t::ThreadDict, value, key)
    lock(t.protect) do
        d = get!(() -> Dict{Symbol, Float64}(), t.dicts, threadid())
        d[key] = value
    end
end

Performance for get/set should be fine; changing the ReentrantLock to a SpinLock might give better performance. If you ever need to iterate over the values in the Dict, you will have to hold the lock for the entire iteration, which means a longer lock and possibly more thread contention.
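
To illustrate the iteration point, here is a rough sketch of holding a lock across a whole loop. The names `SHARED`, `SHARED_LOCK` and `locked_sum` are hypothetical and not part of the code above.

```julia
using Base.Threads

# Hypothetical shared state for the sketch.
const SHARED_LOCK = ReentrantLock()
const SHARED = Dict{Symbol, Float64}(:a => 1.0, :b => 2.0)

# The lock is held for the entire loop, so no other thread that takes the
# same lock can mutate `d` in the middle of the iteration.
function locked_sum(d::Dict{Symbol, Float64}, lk)
    lock(lk) do
        total = 0.0
        for (_, v) in d
            total += v
        end
        total
    end
end

result = locked_sum(SHARED, SHARED_LOCK)
```

The longer the loop body, the longer every other thread waits on the lock, which is where the contention comes from.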


You create one dict for each thread, put them in a vector, and when you want one you index with the thread id. Same as:
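
For illustration, a minimal sketch of the one-Dict-per-thread vector idea might look like the following. The names `NSLOTS`, `DICTS` and `local_dict` are made up here, and the sketch assumes the code between looking up the Dict and using it never yields, so the task cannot move to another thread in between.

```julia
using Base.Threads

# Size the vector by the largest possible thread id. `maxthreadid` only
# exists on newer Julia versions, so fall back to nthreads() otherwise.
const NSLOTS = isdefined(Threads, :maxthreadid) ? Threads.maxthreadid() : nthreads()
const DICTS = [Dict{Symbol, Float64}() for _ in 1:NSLOTS]

# The Dict belonging to whichever thread is running the caller right now.
local_dict() = DICTS[threadid()]

@threads for i in 1:100
    # No yield points between the lookup and the update (assumption above),
    # so each update touches only the current thread's Dict.
    d = local_dict()
    d[:count] = get(d, :count, 0.0) + 1.0
end

# Combine the per-thread results once the loop has finished.
total = sum(get(d, :count, 0.0) for d in DICTS)
```

No lock is needed on the hot path, at the cost of having to merge the per-thread Dicts afterwards. On recent Julia versions tasks can migrate between threads when they yield, so this pattern is only safe while that no-yield assumption holds.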
