julia> using Distributed
julia> addprocs(4);
julia> pmap(x -> println(x), 1:8)
From worker 3: 4
From worker 5: 1
From worker 5: 5
From worker 3: 6
From worker 5: 7
From worker 3: 8
From worker 4: 3
From worker 2: 2
So on the main process, we see the “From worker” lines being printed. Now consider the following code using Slurm and ClusterManagers to connect to a multinode cluster.
Here nothing gets printed on the main process. Is this a bug in how ClusterManagers sets up IO for the worker processes, or is it a bug on Julia’s side?
How do we debug this?
Note that in both cases pmap works properly, i.e. it returns the output array (which in this case is an array of nothing values, since println returns nothing).
ClusterManagers redirects the output streams from the workers to individual files. Check the directory you submitted your job from; you should have a bunch of job*.out files there. These contain the output from the workers.
I think those *.out files are mostly for printing the host/port information for the workers. From what I can tell, it doesn’t redirect standard output.
The host/port line is the first line that is written out. The output from the workers is buffered and written out at a later point, possibly when the Julia session exits. You should definitely have the output once the job is over; at least, that has been my experience. I suppose it might be possible to force the output to appear earlier by explicitly flushing the buffer.
You can flush the buffer using flush(stdout). An example is:
julia> wait(@spawnat 2 println(myid()))
Future(2, 1, 8, nothing)
# output is not written out at this point
shell> cat job-1808884-0000.out
julia_worker:9088#10.0.1.25
julia> wait(@spawnat 2 flush(stdout))
Future(2, 1, 10, nothing)
# output has been written out
shell> cat job-1808884-0000.out
julia_worker:9088#10.0.1.25
2
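If you want the output to appear as it is produced rather than flushing by hand afterwards, one option is to flush right after every print on the worker. Here is a minimal sketch of that idea. It assumes local workers started with addprocs (on Slurm they would come from ClusterManagers instead), and println_flush is a hypothetical helper name, not something provided by Julia or ClusterManagers:

```julia
using Distributed

# Local workers for illustration only; on a Slurm cluster the workers
# would be added through ClusterManagers instead (assumption).
addprocs(2)
@everywhere using Distributed

# Hypothetical helper: print a line, then flush stdout immediately so
# the text does not sit in the worker's output buffer.
@everywhere function println_flush(args...)
    println(args...)
    flush(stdout)
end

results = pmap(1:8) do x
    println_flush("worker ", myid(), " got ", x)
    x^2
end
```

The flush costs a little on every call, so this is mainly something to use while debugging.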
Great, thanks, this solves my problem. I think ClusterManagers should support a keyword argument to control where the IO is printed. It’s extremely handy for debugging purposes to have println statements in the worker code.
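In the meantime, a possible workaround for debugging is to send each message back to process 1 and print it there, so it does not depend on where the worker's stdout ends up. This is only a sketch: log_to_master is a hypothetical helper, not a ClusterManagers feature, and it is shown with local workers, so I can't promise it behaves identically on a Slurm allocation:

```julia
using Distributed

# Local workers for illustration only (assumption).
addprocs(2)
@everywhere using Distributed

# Hypothetical helper: run println on process 1 (the main process)
# and wait until it has finished there.
@everywhere function log_to_master(msg)
    remotecall_wait(println, 1, "From worker $(myid()): $msg")
end

results = pmap(1:4) do x
    log_to_master("processing $x")
    2x
end
```

Each call is a round trip to the main process, so this is better suited to occasional debug messages than to heavy logging.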