I have a question about print (or println) calls inside a threaded loop. If I add print calls inside a threaded loop, Julia just keeps running and produces no output.
For example,
Threads.@threads for ii in 1:N
    ithread = Threads.threadid()
    println(ii)
end
Since my code needs several hours to finish, I would really like to use print calls to see what is going on. Is there a way to solve this?
That was fun:
We can define a thread-safe print function that stores up pending prints until thread 1 gets time to write them out.
Only thread 1 is allowed to print.
Here the queue is a vector, but an IOBuffer might make more sense.
using Base.Threads

const print_lock = SpinLock()
const prints_pending = Vector{String}()

function tprintln(str)
    tid = Threads.threadid()
    str = "[Thread $tid]: " * string(str)
    lock(print_lock) do
        push!(prints_pending, str) # queue the message while holding the lock
        if tid == 1 # only the first thread is allowed to print
            println.(prints_pending)
            empty!(prints_pending)
        end
    end
end
Demo
julia> Threads.@threads for ii in 1:20
           tprintln(ii) # use our new function
       end
[Thread 1]: 1
[Thread 4]: 16
[Thread 1]: 2
[Thread 4]: 17
[Thread 2]: 6
[Thread 4]: 18
[Thread 1]: 3
[Thread 3]: 11
[Thread 4]: 19
[Thread 2]: 7
[Thread 1]: 4
[Thread 4]: 20
[Thread 2]: 8
[Thread 3]: 12
[Thread 1]: 5
Tested in Julia 0.7, which segfaults if the wrong thread tries to print;
it should work without change in 0.6.
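The IOBuffer variant mentioned above could look roughly like this. It is only a sketch, assuming a recent Julia 1.x release where ReentrantLock is safe to use across threads; the names buf_lock, pending_buf, flush_pending and bprintln are made up for illustration and are not part of Base.

```julia
using Base.Threads

# Hypothetical names for this sketch; none of them come from Base.
const buf_lock = ReentrantLock()
const pending_buf = IOBuffer()

# Write everything buffered so far to `io`. Call with `buf_lock` held,
# or after the threaded loop has finished.
function flush_pending(io::IO = stdout)
    write(io, take!(pending_buf))
    return nothing
end

# Append one tagged line to the shared buffer; only thread 1 flushes it.
function bprintln(x; io::IO = stdout)
    tid = threadid()
    lock(buf_lock) do
        println(pending_buf, "[Thread $tid]: ", x)
        tid == 1 && flush_pending(io)
    end
    return nothing
end

@threads for ii in 1:20
    bprintln(ii)
end
flush_pending()  # pick up lines queued after thread 1's last print
```

The final flush_pending() call matters: any line queued after thread 1's last print would otherwise stay in the buffer.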
Thanks for your reply. This is helpful. But it seems the output is not complete. In your result, there are only 15 outputs rather than 20. Another thing is all the 15 outputs here are printed out together after the entire loop finished. I would be happy to see one output immediately after one cycle is completed.
In your result, there are only 15 outputs rather than 20.
Ah yes, that is because some messages are still left in the pending queue after thread 1 has done its last print.
The solution is to check for anything left pending after the loop is done.
There should only be about one item per thread, if the threads are doing roughly equal work.
Another thing is all the 15 outputs here are printed out together after the entire loop finished.
I do not see this.
Perhaps it just looked that way in the small example, with only 20 things and no delay between them?
In the example below I’ve added some code to make each iteration take some time,
and it is clear that the printing happens every time thread 1 runs,
not just at the end.
Code
using Base.Threads

const print_lock = SpinLock()
const pending_prints = Vector{String}()

# Print and clear everything queued so far; must run on thread 1.
function print_pending()
    @assert Threads.threadid() == 1
    println.(pending_prints)
    empty!(pending_prints)
end

function tprintln(str)
    tid = Threads.threadid()
    str = "[Thread $tid]: " * string(str)
    lock(print_lock) do
        push!(pending_prints, str)
        if tid == 1 # only the first thread is allowed to print
            print_pending()
        end
    end
end
Demo
using Dates # provides now() and Millisecond on Julia 0.7 and later

function busy_sleep(time) # Base.sleep is not threadsafe
    start = now()
    while now() - start < Dates.Millisecond(time)
        # spin until `time` milliseconds have elapsed
    end
end

Threads.@threads for ii in 1:100
    busy_sleep(50 + ii)
    tprintln(ii) # use our new function
end
print_pending() # get any prints that are still not done
Sure. But if it is just for providing some feedback from a long-running computation, then it might be good enough. Also, if the loop iterations take long to execute, then the chance of a collision is pretty slim.
You can use a lock with Core.println,
which to me seems better than my earlier solution.
using Base.Threads

const printlock = SpinLock()

Threads.@threads for ii in 1:20
    ithread = Threads.threadid()
    lock(printlock) do
        Core.println("a", "b", "c", ii)
    end
end
Have you tried this? Base.println uses the libuv event queue, which was not thread-safe, even with locks, the last time I tried. (@oxinabox's original approach is a neat way to work around that.)
Another possibility is to disassemble the @threads macro so that printing only happens on the main thread. Maybe you will find my answer on multithreading and ProgressMeter updates useful (maybe you even want to use a ProgressMeter instead of plain printing).
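In the same spirit of keeping all printing in one place, here is a rough sketch that does not take the @threads macro apart but instead hands messages to a single printing task through a Channel. It assumes Julia 1.3 or later, where Channel can be used from several threads; report_progress and work are hypothetical names for this illustration, not an existing API.

```julia
using Base.Threads

# Run `work(i)` for every `i` in `items` on all threads, while a single
# task does all of the printing. Both names are made up for this sketch.
function report_progress(work, items; io::IO = stdout)
    messages = Channel{String}(32)        # buffered queue of pending lines
    printer = @async for msg in messages  # the only task that writes to `io`
        println(io, msg)
    end
    @threads for i in items
        work(i)
        put!(messages, "[Thread $(threadid())]: finished $i")
    end
    close(messages)  # the printer loop ends once the queue is drained
    wait(printer)
    return nothing
end

report_progress(i -> sum(abs2, 1:10_000 * i), 1:8)
```

The printer task started with @async stays on the thread that called report_progress, so if that is the main thread, only the main thread ever touches the output stream. Lines appear whenever that thread is free to run the printer, so the order and timing of the output depend on scheduling.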