I am trying to solve a large PDE, and I want to parallelize a finite-difference computation. I am looking for advice on how to do this.
At some point I do the following:

```julia
using PDEPackage

# struct that contains the parameters
pde = PDEPackage.PDE()

# compute difference
sol1 = rand(100); sol2 = rand(100)
dsol = pde(sol1) - pde(sol2)
```
The computation of `pde(sol)` takes minutes, hence I am considering distributed computing. One issue I have is that `pde` is a callable struct whose internal fields are overwritten on each call `pde(sol)`, so I will need to duplicate that internal data.
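To make the issue concrete, here is a minimal sketch (the `MyPDE` struct and its body are hypothetical stand-ins, not the real `PDEPackage` API): a callable struct with internal scratch data is duplicated with `deepcopy` once per task, so each parallel evaluation mutates only its own copy.

```julia
# Hypothetical stand-in for the callable PDE struct: calling it
# overwrites an internal scratch field, so two concurrent calls on
# the same instance would race.
mutable struct MyPDE
    scratch::Vector{Float64}   # internal field, overwritten on each call
end

function (pde::MyPDE)(sol)
    pde.scratch .= 2 .* sol    # stand-in for the minutes-long computation
    return sum(pde.scratch)
end

pde = MyPDE(zeros(100))
sol1, sol2 = rand(100), rand(100)

# Duplicate the struct (and hence its internal data) once per task,
# so each parallel evaluation mutates its own private copy.
copies = [deepcopy(pde) for _ in 1:2]
sols = (sol1, sol2)
results = zeros(2)
Threads.@threads for i in 1:2
    results[i] = copies[i](sols[i])
end
dsol = results[1] - results[2]
```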
Threaded Assembly · Ferrite.jl has a small example of a parallel FEM computation in which each thread has its own `ScratchValues`: data that the thread owns and can mutate freely. So it is quite similar to what you are considering.
It took me some time to analyse your answers. I forgot to mention that I am targeting distributed computing because each `pde` call is itself threaded. I guess I will have to adapt @kristoffer.carlsson's suggestion.
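For the distributed case, one common pattern (sketched here with a hypothetical `MyPDE` stand-in, not the real package) is to define one PDE instance per worker process with `@everywhere`; each process has its own copy of the global state, so internal mutation is safe, and each call is free to use threads internally.

```julia
using Distributed
addprocs(2)                      # one worker per concurrent pde evaluation

@everywhere begin
    # Hypothetical stand-in for the PDE struct. Each worker process gets
    # its own copy of this global, so internal mutation is safe, and the
    # call body could itself use Threads.@threads.
    mutable struct MyPDE
        scratch::Vector{Float64}
    end
    function (pde::MyPDE)(sol)
        pde.scratch .= 2 .* sol  # stand-in for the threaded computation
        return sum(pde.scratch)
    end
    const pde = MyPDE(zeros(100))
end

sol1, sol2 = rand(100), rand(100)
r1, r2 = pmap(s -> pde(s), [sol1, sol2])  # each call runs on its own worker
dsol = r1 - r2
```

To give each worker its own thread pool, you can (if I remember the keyword correctly) start them with something like `addprocs(2; exeflags="-t 4")`.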
Just FYI, the idiom `scratches[Threads.threadid()]` only works with `Threads.@threads for` (i.e., not with `Threads.@spawn`), and it also relies on an implementation detail of `Threads.@threads`. It's possible that this idiom stops working at some point (though maybe it will have to be kept as-is, given that many programs rely on it). I discussed this a bit in FAQ · FLoops.
For new library functions, I suggest avoiding this idiom.
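A task-local alternative (a toy sketch, not code from this thread, and assuming equal-length inputs): allocate the scratch buffer inside each spawned task instead of indexing by `threadid()`, so the code is correct under `Threads.@spawn` and arbitrary nesting.

```julia
# Each spawned task allocates and owns its scratch buffer, so nothing
# relies on Threads.threadid() staying constant during execution.
function tasklocal_sums(inputs::Vector{Vector{Float64}})
    nchunks = min(Threads.nthreads(), length(inputs))
    chunks = collect(Iterators.partition(eachindex(inputs), cld(length(inputs), nchunks)))
    tasks = map(chunks) do idxs
        Threads.@spawn begin
            scratch = similar(inputs[first(idxs)])  # task-owned, mutated freely
            out = Float64[]
            for i in idxs
                scratch .= 2 .* inputs[i]
                push!(out, sum(scratch))
            end
            out
        end
    end
    return reduce(vcat, fetch.(tasks))
end
```

Because it never pins work to a thread id, a function written this way can itself be called from inside another parallel region.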
From what I understand, you will be able to set the scheduler to `:static`, “which creates one task per thread and divides the iterations equally among them.”
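As a small sketch of that option (assuming a Julia version where `@threads` accepts the `:static` scheduler argument): with `:static`, `threadid()` stays fixed for the whole loop body, so per-thread scratch indexing is safe.

```julia
# :static pins one task per thread, so Threads.threadid() is stable
# inside the loop body and per-thread scratch indexing is safe.
scratches = [zeros(100) for _ in 1:Threads.nthreads()]
results = zeros(8)
Threads.@threads :static for i in 1:8
    s = scratches[Threads.threadid()]   # this thread's own buffer
    s .= float(i)                       # mutate thread-owned scratch freely
    results[i] = sum(s)
end
```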
Yes, that’s why I said library code. Library functions cannot use `:static` since they can be called from any task. I think it’s OK to use `scratches[Threads.threadid()]` in application/script code, since you control how the functions using `@threads :static` are called. But I think it also makes sense to avoid `scratches[Threads.threadid()]` even in script code, since `@threads :static` makes it impossible to nest your function inside parallel higher-order functions (e.g., you may want to parallel-map over some inputs using your parallel function).