I am trying to solve a large PDE, and I want to parallelize a finite-difference computation. I am seeking some advice on how to do this.
At some point I do the following
# struct that contains the parameters
pde = PDEPackage.PDE()
# compute difference
sol1 = rand(100); sol2 = rand(100)
dsol = pde(sol1) - pde(sol2)
The computation of pde(sol) takes minutes, hence I am considering distributed computing. One issue I have is that pde is a callable struct whose internal fields are overwritten on each call pde(sol), so I will need to duplicate the internal data.
I am considering doing the following
pde1 = deepcopy(pde)
pde2 = deepcopy(pde)
data = [(pde1, sol1), (pde2, sol2)]
result = pmap(((p, s),) -> p(s), data)
dsol = result[1] - result[2]
Does anyone have a better suggestion?
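For concreteness, here is a runnable sketch of this idea. ToyPDE is a hypothetical stand-in for PDEPackage.PDE (a callable struct that mutates an internal buffer on each call, like the real one):

```julia
using Distributed
addprocs(2)  # one worker per independent evaluation

# Hypothetical stand-in for the real PDE struct: calling it overwrites
# an internal buffer, which is why each worker needs its own copy.
@everywhere struct ToyPDE
    buf::Vector{Float64}
end
@everywhere (p::ToyPDE)(sol) = (p.buf .= 2 .* sol; copy(p.buf))

pde = ToyPDE(zeros(100))
sol1 = rand(100); sol2 = rand(100)

# Each entry carries its own deep copy, so the mutated internals never clash.
data = [(deepcopy(pde), sol1), (deepcopy(pde), sol2)]
result = pmap(((p, s),) -> p(s), data)
dsol = result[1] - result[2]
```

Note that pmap already serializes each tuple to its worker, so the deepcopy mainly guards against mutation on the master process.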
Threaded Assembly · Ferrite.jl has a small example of a parallel FEM computation. Each thread has its own "ScratchValues", which is data that the thread owns and can mutate freely. So it is quite similar to what you are considering.
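A minimal sketch of that pattern (not Ferrite's actual code; just a toy per-thread scratch buffer, assuming the `:static` scheduler so threadid() is stable within an iteration):

```julia
using Base.Threads

# One scratch buffer per thread, allocated up front; each thread
# mutates only its own buffer, so no locking is needed.
n = 1000
inputs = rand(n)
scratches = [zeros(10) for _ in 1:nthreads()]
results = zeros(n)

@threads :static for i in 1:n
    s = scratches[threadid()]  # this thread's private workspace
    s .= inputs[i]             # free to mutate: no other thread touches it
    results[i] = sum(s)
end
```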
Here are some course notes on doing finite differences on the GPU (or threaded CPU): GitHub - luraess/geo-hpc-course: Parallel CPU and GPU high-performance computing - short course. Maybe good for inspiration.
Thank you for your suggestions!
It took me some time to analyse your answers. I forgot to mention that I am targeting distributed computing because each pde evaluation is itself threaded. I guess I will have to modify @kristoffer.carlsson's suggestion.
Just FYI, the idiom scratches[Threads.threadid()] only works with Threads.@threads for (i.e., not with Threads.@spawn), and it relies on an implementation detail of Threads.@threads. It is possible that this idiom stops working at some point (though maybe it would need to be kept as-is, given that many programs rely on it). I discussed this a bit in FAQ · FLoops. For new library functions, I suggest avoiding this idiom.
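A sketch of one task-local alternative (my own illustration, not from FLoops): spawn one task per chunk and give each task its own scratch, so correctness does not depend on how tasks map to threads. The helper name tmap_with_scratch is made up for this example:

```julia
using Base.Threads

# Each spawned task owns its scratch buffer outright; no indexing by
# threadid(), so this also works if tasks migrate between threads.
function tmap_with_scratch(f, xs; ntasks = nthreads())
    chunks = Iterators.partition(eachindex(xs), cld(length(xs), ntasks))
    out = similar(xs, Float64)
    tasks = map(chunks) do idxs
        @spawn begin
            scratch = zeros(10)      # owned by this task alone
            for i in idxs
                scratch .= xs[i]
                out[i] = f(scratch)  # tasks write disjoint indices of out
            end
        end
    end
    foreach(wait, tasks)
    return out
end

xs = rand(100)
ys = tmap_with_scratch(sum, xs)
```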
From what I understand, you will be able to set the scheduler to :static, "which creates one task per thread and divides the iterations equally among them."
Yes, that's why I said library code. Library functions cannot use :static since they can be called from any task. I think it's OK to use scratches[Threads.threadid()] in application/script code, since you can control how the functions using @threads :static are called. But I think it also makes sense to avoid scratches[Threads.threadid()] even in script code, since @threads :static makes it impossible to nest your function into parallel higher-order functions (e.g., you may want to parallel-map over some inputs using your parallel function).
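A toy illustration of the composability point: a task-based parallel function (psum here is a made-up example) can itself be called from an outer layer of parallel tasks, which @threads :static does not allow.

```julia
using Base.Threads

# Toy parallel reduction built from tasks instead of @threads :static.
function psum(xs)
    mid = length(xs) ÷ 2
    a = @spawn sum(view(xs, 1:mid))       # first half in a child task
    b = sum(view(xs, mid+1:length(xs)))   # second half in the current task
    return fetch(a) + b
end

# Nesting is fine: parallel-map psum over several inputs.
datasets = [rand(1000) for _ in 1:4]
totals = fetch.([@spawn psum(d) for d in datasets])
```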