Hey, community! I am new to Julia and to multithreading/parallel computing. I am trying to clarify some things that I've read and to find the best way to solve my problem given my computational resources. My problem is: I would like to solve the same finite volume problem many times with different arguments, as fast as possible, on a single personal computer.
I have one function, fv_strucgrid(Nx, Ny, Lx, Ly, bc, scr), that solves a finite volume problem on a structured grid of size Lx by Ly, divided into Nx by Ny parts respectively. Here bc is a boundary condition vector and scr is a source term function, scrfun(x,y) = 2*cos.(x)'.*cos.(y).
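To make the source term concrete, here is a small sketch of what scrfun returns when x and y are coordinate vectors. The grid coordinates below are hypothetical cell centres chosen only for illustration; they are an assumption, not necessarily how fv_strucgrid builds its grid.

```julia
# Source term exactly as written in the post.
scrfun(x, y) = 2*cos.(x)' .* cos.(y)

# Hypothetical cell-centre coordinates for a tiny grid (assumption for
# illustration only; the real fv_strucgrid may lay out its grid differently).
Nx, Ny = 4, 3
Lx, Ly = pi, pi
x = [(i - 0.5) * Lx / Nx for i in 1:Nx]
y = [(j - 0.5) * Ly / Ny for j in 1:Ny]

# cos.(x)' is a 1 x Nx row and cos.(y) is a length-Ny column, so the
# broadcasted product is an Ny x Nx matrix with S[j, i] = 2cos(x[i])cos(y[j]).
S = scrfun(x, y)
println(size(S))
```

So with vector inputs the result has one row per y value and one column per x value.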
My version of Julia is
julia> versioninfo()
Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, haswell)
Environment:
JULIA_NUM_THREADS = 4
and my CPU information is
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 69
Model name: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
Stepping: 1
CPU MHz: 1788.265
CPU max MHz: 3100,0000
CPU min MHz: 800,0000
BogoMIPS: 5187.55
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
NUMA node0 CPU(s): 0-3
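As a quick sanity check, the thread count that Julia actually started with can be compared against the logical CPUs the OS reports. This is only a sketch; the values in the comments are what the outputs above suggest for this machine, not something guaranteed elsewhere.

```julia
# Threads available to this Julia session (set by JULIA_NUM_THREADS,
# which is 4 in the versioninfo() output above).
println("Julia threads: ", Threads.nthreads())

# Logical CPUs seen by Julia. The lscpu output above reports 4, made up
# of 2 physical cores with 2 hardware threads each.
println("Logical CPUs:  ", Sys.CPU_THREADS)
```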
The four principal ways that I found to run my code in parallel are pmap, @distributed for, Threads.@threads for, and Threads.@spawn inside a for loop. Here are the examples:
using Distributed  # needed for pmap and @distributed

pmap(args -> fv_strucgrid(args...), [[320, 320, pi, pi, rand(2*320+2*320), scrfun] for i = 1:10000])

@distributed for i = 1:10000
    fv_strucgrid(320, 320, pi, pi, rand(2*320+2*320), scrfun)
end

Threads.@threads for i = 1:10000
    fv_strucgrid(320, 320, pi, pi, rand(2*320+2*320), scrfun)
end

@sync for i = 1:10000
    Threads.@spawn fv_strucgrid(320, 320, pi, pi, rand(2*320+2*320), scrfun)
end
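Note that the loops above discard every result. Below is a minimal sketch of the two thread-based variants that keeps each result and checks it against a serial run. fake_solve is a hypothetical stand-in for fv_strucgrid (an assumption, used so the snippet runs on its own), and the number of cases is reduced to 100 to keep it quick.

```julia
# Hypothetical stand-in for fv_strucgrid: any function that depends only
# on its arguments works for this sketch.
fake_solve(bc) = sum(abs2, bc)

ncases = 100
# Build the boundary condition vectors up front so that every strategy
# sees exactly the same inputs.
bcs = [rand(2*320 + 2*320) for _ in 1:ncases]

# Serial reference.
serial = [fake_solve(bc) for bc in bcs]

# Threads.@threads splits the iteration range among the available threads.
# Each iteration writes only to its own slot, so there is no data race.
threaded = Vector{Float64}(undef, ncases)
Threads.@threads for i in 1:ncases
    threaded[i] = fake_solve(bcs[i])
end

# Threads.@spawn creates one task per case; fetch waits for a task and
# returns its result.
tasks = [Threads.@spawn fake_solve(bc) for bc in bcs]
spawned = [fetch(t) for t in tasks]

println(threaded == serial, " ", spawned == serial)
```

Both thread-based variants share memory within one Julia process, whereas pmap and @distributed run on separate worker processes (added with addprocs or by starting julia with -p), so the solver function would also have to be defined on every worker, for example with @everywhere.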
The only thing that I change in each iteration is the boundary condition. So my question is: could anyone explain, or point me to any material/link that would help me understand, the difference between these methods and how to choose the best option for my computer? Code examples are welcome too. Thanks!!