I'm new to Julia. I've written some parallel code using Distributed, and I want to write the intermediate results of the calculation to a file. This is my code:
# Set up parallel workers
# Compute the result
# Open the file in append mode
f = open("output.txt", "a")
# # Write the result to the file
println(f, "Result: ", result)
# open("output.txt","a",lock=true) do io
# write(io,"Result: "*string(result))
@distributed for i in 1:10000
result = my_function(i)
println("Result for iteration $i: $result")
I want output.txt to look as follows:
but I got:
I can't understand why.
You are creating a race condition when multiple processes try to write to the same file simultaneously and end up overwriting one another.
You’ll either need to
- use a lock file to control access (so that only one process writes to the file at a time), e.g. using Pidfile
- use a different file format that supports parallel I/O (e.g. parallel HDF5, though that requires you to use MPI).
- send all of the data you want to output to a single process, and write from that.
The third option is usually the simplest and most robust.
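As a rough sketch of the third option: the workers only compute, and the calling process collects the results and does all the writing. The worker count, the body of my_function, and the range 1:100 are placeholders I made up for illustration. Note that this version writes everything after the work has finished, not while it is running.

```julia
using Distributed

# Assumption: two local workers; adjust to your machine.
nprocs() == 1 && addprocs(2)

# Hypothetical stand-in for the real computation.
@everywhere my_function(i) = i^2

# pmap runs my_function on the workers and returns the results,
# in input order, on the calling process.
results = pmap(my_function, 1:100)

# Only the calling process touches the file, so there is nothing to race on.
open("output.txt", "w") do io
    for (i, r) in enumerate(results)
        println(io, "Result for iteration $i: $r")
    end
end
```

Because pmap preserves input order, the lines in output.txt come out in order of i.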
Thank you very much, I will try the third option, the first two were a bit difficult for me.
In the documentation for the function open, I see this: “The lock keyword argument controls whether the operation will be locked to ensure secure multithreaded access.” But it doesn’t seem to work. Thanks again.
Given that your output is pretty simple, I think a fourth way to do this would be with Logging and then use LoggingExtras.jl to pipe them to a file.
@zhpf0530 I hope this is said in a nice manner. You are doing distributed processing. Multithreaded applies to running on a single server.
@johnh Thanks. Multithreading is really good for running on a single server. I tried it, and the result was the same, and the output file was also messy. Following Steward’s suggestion, I now use channels to log the data, and then a separate process writes the data.
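For anyone who finds this later, the channel approach looks roughly like the sketch below. This is a simplified version, not my real code: my_function, the worker count, the buffer size of 32, and the use of nothing as an end-of-data marker are all choices made for the example.

```julia
using Distributed

# Assumption: two local workers; adjust to your machine.
nprocs() == 1 && addprocs(2)

# Hypothetical stand-in for the real computation.
@everywhere my_function(i) = i^2

# The channel lives on the master process; workers put results into it.
# `nothing` is used here as a sentinel meaning "no more data".
results = RemoteChannel(() -> Channel{Union{Nothing,Tuple{Int,Int}}}(32))

# Writer task on the master: the only code that ever touches the file.
writer = @async begin
    open("output.txt", "w") do io
        while true
            item = take!(results)
            item === nothing && break
            i, r = item
            println(io, "Result for iteration $i: $r")
        end
    end
end

# Workers compute and send (i, result) pairs instead of writing themselves.
function run_all(ch, n)
    @sync @distributed for i in 1:n
        put!(ch, (i, my_function(i)))
    end
end

run_all(results, 100)
put!(results, nothing)  # tell the writer that all results have been sent
wait(writer)            # make sure the file is fully written and closed
```

The lines arrive in whatever order the workers finish, so the file is not sorted by i, but each line is written whole.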
I think there are two mistakes regarding the use of lock=true with open in the code above (which is the default, by the way):
- it’s meant for multithreading rather than multiprocessing
- it’s meant for multiple accesses to the io value returned by open
The second is the biggest mistake: if you call io = open(..., lock=true) in different threads, then each thread will use its own io object with its own lock, so the locks are useless. Instead, you should call open only once and share the io between threads.
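A minimal sketch of what "open once and share" could look like with threads (not with Distributed). my_function and the range are made up for the example, and I am relying on my understanding that println takes the stream's lock for the whole call, so each line is written in one piece.

```julia
using Base.Threads

# Hypothetical stand-in for the real computation.
my_function(i) = i^2

# Open the file ONCE; every thread writes through the same io object,
# so they all contend for the same lock (lock=true is the default).
open("output.txt", "w") do io
    @threads for i in 1:100
        # Build the whole line first, then hand it to a single println call.
        println(io, "Result for iteration $i: $(my_function(i))")
    end
end
```

Start Julia with several threads (e.g. julia -t 4) to see any effect. The order of the lines depends on thread scheduling, so it will generally not be sorted by i.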
You’re not doing multi-threaded access. You’re doing multi-process access. Google processes vs threads.