How to make JuliaDB writes thread-safe

The JuliaDB documentation doesn’t describe a thread-safe interface for writes, and the code below gives unexpected results when run with multiple threads. Is there a recommended way to do this?

I have a hunch there may be an implementation utilizing some kind of explicit lock on the table, but I’m at a loss regarding these ‘First Steps’ ;).

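using JuliaDB, Random

struct ModelState
  iter::Int # this name is meaningless
  weights::Array{Float32,2} # field name assumed; the original snippet is cut off here
end

myjdb2 = JuliaDB.table((modelname=String[],
                        state=ModelState[])) # second column inferred from the output below

Threads.@threads for i in 1:10
  mod = ModelState(i,rand(2,2))
  # unsynchronized push! from several threads (call reconstructed from the replies below)
  push!(rows(myjdb2), (modelname="foo$i", state=mod))
end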

The resulting table is missing the seventh entry ("foo7"):

julia> myjdb2
Table with 9 rows, 2 columns:
modelname  state
"foo1"     ModelState(1, Float32[0.236033 0.312707; 0.346517 0.00790928])
"foo3"     ModelState(3, Float32[0.811698 0.807622; 0.988432 0.970091])
"foo8"     ModelState(8, Float32[0.0123577 0.880851; 0.894287 0.769253])
"foo2"     ModelState(2, Float32[0.366796 0.210256; 0.523879 0.819338])
"foo10"    ModelState(10, Float32[0.112582 0.344454; 0.368314 0.0566454])
"foo5"     ModelState(5, Float32[0.530365 0.30131; 0.777009 0.881349])
"foo6"     ModelState(6, Float32[0.511477 0.830376; 0.121374 0.0263919])
"foo9"     ModelState(9, Float32[0.707078 0.258221; 0.26533 0.219948])
"foo4"     ModelState(4, Float32[0.680079 0.92407; 0.874437 0.929336])

I think that might be inherently thread-unsafe, since all threads have to mutate the same array. If the push itself isn’t the bottleneck (that is, most of the time goes into computing each row rather than storing it), you could just guard it with a lock to make it safe.
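For illustration, here is a minimal sketch of the locking idea using only Base. The vector `results` is a stand-in for the table's row storage and `lk` is a name I made up; I haven't tested this against a JuliaDB table, but the same pattern should apply to any `push!` call.

```julia
# Sketch: serialize push! with a lock. `results` stands in for the table's rows.
results = Vector{NamedTuple{(:modelname, :iter), Tuple{String, Int}}}()
lk = ReentrantLock()

Threads.@threads for i in 1:10
    row = (modelname = "foo$i", iter = i)  # per-row work happens outside the lock
    lock(lk) do
        push!(results, row)                # only one thread mutates at a time
    end
end

length(results)  # every row is present; the order depends on thread scheduling
```

Keeping the expensive work outside the `lock` block means threads only wait on each other for the brief push.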

Thanks, I’ll maybe just resort to a basic array. JuliaDB just came on my radar while blindly searching for thread-safe IO solutions. I was probably browsing the docs and took their description of the parallelized query support out of context.

push! is not thread-safe for a JuliaDB table or for a regular Array, so neither will work by default. You can do some manual locking, or you could write a wrapper around the table/array that provides a thread-safe push!. I’m sure there are also packages out there that make this easier in some way, but I’m not aware of any.
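A wrapper along those lines could look roughly like this. `LockedCollection` is a hypothetical type, not something from JuliaDB or Base, and I've only wrapped a plain `Vector` here; wrapping a table's rows would be the assumption to verify.

```julia
# Hypothetical wrapper: pairs any push!-able collection with its own lock.
struct LockedCollection{T}
    data::T
    lk::ReentrantLock
end
LockedCollection(data) = LockedCollection(data, ReentrantLock())

# Thread-safe push!: take the lock, then forward to the wrapped collection.
function Base.push!(c::LockedCollection, item)
    lock(c.lk) do
        push!(c.data, item)
    end
    return c
end

safe = LockedCollection(String[])
Threads.@threads for i in 1:10
    push!(safe, "foo$i")
end
```

The benefit over manual locking is that callers can't forget to take the lock, as long as they only go through the wrapper and never touch `safe.data` directly from several threads.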

So would this hold true even if there were no write-conflicts within threads? Such a case seems thread-safe since each thread could write to a 2-d array where the top level index is over Threads.threadid().

Yes, as long as push! is involved. Even if the underlying array were large enough to accommodate all combined writes (if it isn’t, growing it most likely requires reallocating and moving the array data), the threads would still race to increment the end-of-array index stored in the array object. You’d end up with threads writing into the same array index, or various other kinds of undefined behavior.
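If the number of rows is known up front, one way to avoid that race entirely is to skip push! and have each iteration write to its own preallocated slot. A minimal sketch, with `modelnames` and `iters` as made-up column names:

```julia
n = 10
modelnames = Vector{String}(undef, n)  # sized up front, so no thread changes the length
iters = zeros(Int, n)

Threads.@threads for i in 1:n
    modelnames[i] = "foo$i"  # iteration i is the only writer of slot i
    iters[i] = i
end
```

No lock is needed because the array length never changes and no two iterations touch the same index. Indexing by the loop variable rather than by Threads.threadid() also avoids relying on a task staying on one thread. The filled columns could then presumably be passed to `table` in one go after the loop finishes.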