I’m trying to write something providing mysql-like functions: insert one row then return id.
d = Dict{Int64,Int64}()
Threads.@threads for i in 1:10000
if !haskey(d,i)
d[i] = length(d)+1
end
end
julia> length(d)
5670
julia> collect(d)
ERROR: ArgumentError: destination has fewer elements than required
Stacktrace:
[1] copyto!(dest::Vector{Pair{Int64, Int64}}, src::Dict{Int64, Int64})
@ Base ./abstractarray.jl:897
[2] _collect
@ ./array.jl:715 [inlined]
[3] collect(itr::Dict{Int64, Int64})
@ Base ./array.jl:709
[4] top-level scope
@ REPL[5]:1
I can use an extra Vector to fix this, but is there another better solution?
I suggest just protect the plain Dict with a ReentrantLock. Protecting the integrity of the dictionary itself is not sufficient for the correctness of your program. Consider the following scenario involving two tasks T1 and T2:
T1 wants a new id for key 123
T2 wants a new id for key 456
T1 calls length(d), get 41
T2 calls length(d), get 41
T1 inserts d[123]= 42
T2 inserts d[456]= 42
Now two keys refer to the same ID and no keys have id 43.
The code in the OP would be fine if you put a lock around if !haskey(d,i) ... end.
There are a lot of clever techniques to make this efficient. But reasoning about concurrent programs requires a lot more thinking than some comments fit in a discourse post.