Parallel Postprocessing

Hi everyone,

I am using PyPlot in Julia to postprocess simulation data. Here is a sample script for my parallel workflow:

using Distributed
@everywhere using PyCall
@everywhere matplotlib = pyimport("matplotlib")
@everywhere matplotlib.use("Agg")
@everywhere using PyPlot, Glob

@everywhere function process(filename::String, dir::String=".")
    np = pyimport("numpy")
    # readdata is my own reader for the simulation output files
    filehead, data, filelist = readdata(filename, dir=dir, verbose=false)
    # Postprocessing... (the elided steps extract `time` from the data)
    plt.savefig("$(time).png")
    println("finished saving $(time)!")
end

# Define path and file pattern
dir = "."
filename = "y*.out"

# Find filenames matching the pattern
filenames = glob(filename, dir)

# Processing
@distributed for filename in filenames
    println("filename: $(filename)")
    process(filename, dir)
end

In this way, I find all the filenames on one processor, distribute the names among the workers, and do the plotting and saving on each worker using @distributed. Since there is no dependency between processing different files, this seems to work.

I am wondering if there are better ways to do this, say, using @threads or a Channel? Any ideas are appreciated!


For an embarrassingly parallel task like this, your approach looks good to me. Simple and effective.
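If you want something more compact, pmap from Distributed does essentially the same scheduling and blocks until everything is done. A minimal sketch, assuming the process function and filenames list from your script:

# pmap distributes the calls over the available workers
# and waits for all of them to complete
pmap(f -> process(f, dir), filenames)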

Later I encountered an issue when running this script from the command line, but not in the REPL. In the REPL, everything looks fine and the plots are saved in PNG format. However, if I just type

julia process.jl

then the function process is never executed, and no error message is returned. What’s wrong here? It feels like the scheduled tasks in the queue are never executed.

The driver script is sending the tasks off to the workers and exiting immediately, because a @distributed for loop without a reducer returns immediately. This in turn kills the workers. It doesn’t happen in the REPL because the REPL stays alive after you run each command.

You need to make your driver script wait for all the workers to finish, for example by doing a dummy reduce.


Makes sense! So what is the command I’m missing? Some synchronization primitive, barrier(), or just wait()?

The docs for @distributed say that if you give it a reducer it’ll wait for the workers, so that it can compute the reduction. So you can have each worker return something, say 1, and use + as the reducer.
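A minimal sketch of that dummy reduce, assuming the filenames and dir variables from the original script:

# The (+) reducer makes @distributed block until all workers finish;
# each iteration returns 1, so the result is the number of files processed
nfiles = @distributed (+) for filename in filenames
    process(filename, dir)
    1
end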

Or just add @sync in front of @distributed.
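Like this; the @sync blocks until all the spawned tasks are done:

@sync @distributed for filename in filenames
    process(filename, dir)
end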


I created ThreadTools for this kind of task:
https://github.com/baggepinnen/ThreadTools.jl
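For this use case, a minimal sketch with its tmap, assuming the process function and filenames from the earlier script. One caveat: calling into Python through PyCall from multiple threads may not be safe, so this fits best when the processing is pure Julia:

using ThreadTools

# tmap spawns one task per element and runs them on the available threads
tmap(f -> process(f, "."), filenames)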


Is this for the upcoming v1.3 only? I would think a thread pool is preferable for this kind of parallel IO work.

Yeah, it only works on v1.3.
I have a small benchmark in the README indicating roughly where the overhead caused by my implementation strategy starts becoming noticeable.

Can’t wait to try v1.3 and your package! Thanks for sharing!

Nice!

@baggepinnen your response prompted me to catch up on the multithreading news.

If I understand the single global lock on libuv correctly, it means that interacting with it is done on a privileged thread, but the actual IO can happen in parallel? If not, then threads wouldn’t help for IO-heavy tasks…