I am trying to use @spawn to parallelize a computation, but I am experiencing some problems which I do not understand. Here is the simplest code with which I could reproduce the issue:
using Dates

function f()
    s = 0.
    for i in 1:10^10
        s += rand()
    end
    return s
end

function t()
    for i in 1:Threads.nthreads()
        Threads.@spawn f()
        println(i, " ", Dates.now())
    end
    println("end")
end

println(Threads.nthreads())
t()
A typical output is:
4
1 2020-08-08T20:41:15.736
2 2020-08-08T20:41:15.851
3 2020-08-08T20:41:55.117
4 2020-08-08T20:41:55.117
end
What I expected, however, was that at least nthreads-1 dates would be printed almost instantaneously (-1 because perhaps one thread is being used by the main interpreter; that is part of the question).
However, that is not what happens. The dates take a while to be printed, as if the loop were waiting for the spawned tasks to finish before continuing.
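One way to narrow this down is to record when each spawn call returns and which thread each task starts on. The sketch below assumes a much shorter loop than f() so that it finishes quickly; `work` and `spawn_report` are hypothetical names introduced only for this check, and a task is not guaranteed to stay on the thread it starts on.

```julia
using Dates

# Shorter stand-in for f(); the loop length n is an assumption for testing.
function work(n)
    s = 0.0
    for _ in 1:n
        s += rand()
    end
    return s
end

# Spawn one task per thread. For each task, record the time at which the
# spawn call returned and the id of the thread the task started on.
function spawn_report(n)
    tasks = Task[]
    spawned_at = DateTime[]
    for i in 1:Threads.nthreads()
        task = Threads.@spawn begin
            tid = Threads.threadid()
            (tid, work(n))
        end
        push!(tasks, task)
        push!(spawned_at, Dates.now())
    end
    results = fetch.(tasks)  # wait for every task to finish
    return [(i, spawned_at[i], results[i][1]) for i in eachindex(tasks)]
end

for (i, t, tid) in spawn_report(10^7)
    println(i, " spawned at ", t, ", started on thread ", tid)
end
```

If some task reports the same thread id as the loop itself, and the timestamps after it are delayed, that would be consistent with the loop having to wait for that task before it can continue.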
I recorded a video of that behavior, available at the link below. You will see that the date for the second thread takes a long time to be printed (even though the “2” is printed). Perhaps this is just output buffering, but my problem is that the program seems to take much longer than it should if the tasks were actually being scheduled and the computations done in parallel.
https://drive.google.com/file/d/1yJUKA1gCGga7-R2r-s_7VwpTIzL5VBPh/view?usp=sharing
It seems that the number of free threads varies: sometimes the tasks run 2 at a time, sometimes 4. It is not clear to me what to expect here. The same code behaves differently each time it is run.
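To put a number on "taking much longer than it should", the total time of the spawned version can be compared with running the same work serially. This is only a sketch: `partial_sum`, `run_serial` and `run_spawned` are hypothetical names, the loop length is an assumption chosen so that the comparison finishes quickly, and unlike t() above it waits for all tasks with `fetch` before stopping the clock.

```julia
# Same summation as f(), with a configurable length (n is an assumption).
function partial_sum(n)
    s = 0.0
    for _ in 1:n
        s += rand()
    end
    return s
end

# Run k copies of the work one after another on the current task.
run_serial(k, n) = [partial_sum(n) for _ in 1:k]

# Run k copies as spawned tasks and wait for all of them to finish.
function run_spawned(k, n)
    tasks = map(1:k) do _
        Threads.@spawn partial_sum(n)
    end
    return fetch.(tasks)
end

k = Threads.nthreads()
n = 10^7

# Warm up both paths so that compilation time is not included below.
run_serial(1, 10)
run_spawned(1, 10)

t_serial = @elapsed run_serial(k, n)
t_spawned = @elapsed run_spawned(k, n)
println("serial: ", t_serial, " s, spawned: ", t_spawned, " s")
```

If the tasks really run in parallel, the spawned time should be noticeably smaller than the serial time when Threads.nthreads() is greater than 1; repeating the measurement a few times would show how much it varies from run to run.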