Length of Dict is not thread safe

Jason89757 · January 20, 2022, 5:30am

I’m trying to write something providing mysql-like functions: insert one row then return id.

d = Dict{Int64,Int64}()
Threads.@threads for i in 1:10000
  if !haskey(d,i)
    d[i] = length(d)+1
  end
end

julia> length(d)
5670

julia> collect(d)
ERROR: ArgumentError: destination has fewer elements than required
Stacktrace:
 [1] copyto!(dest::Vector{Pair{Int64, Int64}}, src::Dict{Int64, Int64})
   @ Base ./abstractarray.jl:897
 [2] _collect
   @ ./array.jl:715 [inlined]
 [3] collect(itr::Dict{Int64, Int64})
   @ Base ./array.jl:709
 [4] top-level scope
   @ REPL[5]:1

I can use an extra Vector to fix this, but is there another better solution?

goerch · January 20, 2022, 5:42am

How to prevent data races: Multi-Threading · The Julia Language

Jason89757 · January 20, 2022, 6:03am

Thanks a lot!
Struct a type with data and lock can fix this.

Jason89757 · January 20, 2022, 6:14am

found a solution:
https://github.com/wherrera10/ThreadSafeDicts.jl

tkf · January 20, 2022, 6:55am

I suggest just protect the plain Dict with a ReentrantLock. Protecting the integrity of the dictionary itself is not sufficient for the correctness of your program. Consider the following scenario involving two tasks T1 and T2:

T1 wants a new id for key 123
T2 wants a new id for key 456
T1 calls length(d), get 41
T2 calls length(d), get 41
T1 inserts d[123]= 42
T2 inserts d[456]= 42

Now two keys refer to the same ID and no keys have id 43.

The code in the OP would be fine if you put a lock around if !haskey(d,i) ... end.

There are a lot of clever techniques to make this efficient. But reasoning about concurrent programs requires a lot more thinking than some comments fit in a discourse post.

lawless-m · January 20, 2022, 8:08am

I don’t want to be deliberately snarky but “what made you think it would be?”

When entering the world of concurrency, one has to think carefully.

Topic		Replies	Views
Can dicts be threadsafe? General Usage multithreading	17	6308	September 22, 2022
Enumerate with keys from a dict New to Julia	3	714	July 2, 2020
Does @threads not allow iteration over Dict()? New to Julia question	2	686	January 29, 2020
Dicts in different @async blocks New to Julia question	8	527	April 23, 2020
Thread-local `Dict` for each thread? New to Julia multithreading	3	851	December 8, 2020

Length of Dict is not thread safe

Related topics