I am running simulations in parallel processes that take many hours to run, so I print the replication number to keep track of progress. Here is an MWE:
# test.jl
using Distributed
addprocs(2)
@everywhere function simulate(n)
    for i in 1:n
        sleep(1)
        println(i)
        # flush(stdout)
    end
    return 0
end
function simulate_parallel(n)
    wrkrs = workers()
    nwrkrs = length(wrkrs)
    futures = Vector{Future}(undef, nwrkrs)
    for (i, w) in enumerate(wrkrs)
        # integer division, so each worker gets an Int replication count
        futures[i] = @spawnat w simulate(n ÷ nwrkrs)
    end
    results = Vector{Int}(undef, nwrkrs)
    for i in 1:nwrkrs
        results[i] = fetch(futures[i])
    end
    return results
end
@time simulate_parallel(30)
So far everything works as expected if I run the file in the REPL or as a script. Sometimes I run a batch of simulations overnight and redirect the output to log files because I want to see how long it took to run each file and if any errors occurred. So I run a batch file that looks like this (I am using Windows):
julia test.jl > log1.txt 2>&1
julia test.jl > log2.txt 2>&1
The 2>&1 means that both stdout and stderr are redirected to the log file.
The problem is that the output is not flushed to the file (flush(stdout) has no effect in my simulate function), so for hours I have no idea how far along the simulation is. Is there any way to force a flush every time a worker prints?