JLD on Multithreading

jld
hdf5
multithreading

#1

I am computing matrix entries with multiple threads and would now like to save the matrix to a file, because it is getting too big for my RAM.

I read that JLD is the most commonly used package for saving Julia variables.

But when I try to write the entries to a file from inside the threaded loop:

using JLD
function test()
	save("test2.jld", "test", 1) #create file
	f = jldopen("test2.jld", "r+")
	Threads.@threads for i = 1:10
		for j = 1:10
			x = i + j
			write(f, "$i $j", x)
		end
	end
end

I get lots of errors, the first of them is:

Exception: EXCEPTION_ACCESS_VIOLATION at 0x6dad7bb7 – HDF5-DIAG: Error detected in HDF5 (1.8.13) at 0x6dad7bb7

I also tried to create a separate file for each matrix entry, but that did not work either.

Does anyone know how to fix this or can recommend another way to save the matrix?

Thanks in advance


#2

Writing in parallel to a single file is always dangerous, because those writes must be properly coordinated. I'm not sure whether the Julia/JLD implementation does that. To be on the safe side, I suggest serializing the writes. To do so, create a lock at the start of your function:

sl = Threads.SpinLock()

And then surround the write call with calls to lock and unlock:

lock(sl)
write(...)
unlock(sl)

In production code the call to write should probably be in a try block and the unlock call in a finally clause.
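
For illustration, here is a minimal sketch of that try/finally pattern applied to the loop from the question. The function name save_entries and its path and n arguments are hypothetical, introduced only for this example. It also swaps in a ReentrantLock for the SpinLock, on the assumption that the Julia documentation's advice applies here: spin locks are meant for short, non-blocking sections rather than I/O. Treat it as a sketch, not a tested recipe for every JLD/HDF5 version.

```julia
using JLD

# Hypothetical helper: writes i + j for every (i, j) pair into one JLD file.
function save_entries(path::AbstractString, n::Integer = 10)
    lk = ReentrantLock()                  # one lock shared by all threads
    jldopen(path, "w") do f               # the do-block closes the file on exit
        Threads.@threads for i in 1:n
            for j in 1:n
                x = i + j                 # computed in parallel, outside the lock
                lock(lk)
                try
                    write(f, "$i $j", x)  # only one thread writes at a time
                finally
                    unlock(lk)            # released even if write throws
                end
            end
        end
    end
    return nothing
end
```

After calling save_entries("test2.jld"), load("test2.jld")["3 4"] should return 7. The lock/try/finally/unlock sequence can also be written more compactly as lock(lk) do ... end, which releases the lock in the same way.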


#3

Okay, that works perfectly, thank you very much!