Hey, community! I am new to Julia and to multithreading/parallel computing. I am trying to clarify some things that I've read and to find the best way to solve my problem given my computational resources. My problem is this: I would like to solve the same finite volume problem many times with different arguments, as fast as possible, on a single personal computer.
I have one function
fv_strucgrid(Nx, Ny, Lx, Ly, bc, scr)
that solves a finite volume problem on a structured grid of sizes Lx and Ly, divided into Nx and Ny parts respectively. bc is a boundary condition vector and scr is a source term function, such as
scrfun(x,y) = 2*cos.(x)'.*cos.(y)
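To make the shape of the source term concrete, here is a small sketch. The cell-centre coordinates and the tiny 4 x 3 grid are assumptions made only for this illustration; fv_strucgrid may sample the grid differently.

```julia
scrfun(x, y) = 2 * cos.(x)' .* cos.(y)

# Hypothetical cell-centre coordinates for a tiny grid on [0, pi] x [0, pi]
Nx, Ny = 4, 3
Lx, Ly = pi, pi
x = [(i - 0.5) * Lx / Nx for i in 1:Nx]
y = [(j - 0.5) * Ly / Ny for j in 1:Ny]

# cos.(x)' is a 1 x Nx row and cos.(y) is a length-Ny column,
# so broadcasting produces an Ny x Nx matrix
S = scrfun(x, y)
println(size(S))
```

Rows of S follow y and columns follow x, so S[j, i] is the source value at (x[i], y[j]).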
My version of Julia is
julia> versioninfo()
Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)
Environment:
  JULIA_NUM_THREADS = 4
and my computer's specifications are
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 69
Model name: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
Stepping: 1
CPU MHz: 1788.265
CPU max MHz: 3100,0000
CPU min MHz: 800,0000
BogoMIPS: 5187.55
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
NUMA node0 CPU(s): 0-3
The four main ways that I found to run my code in parallel are pmap, @distributed for, Threads.@threads for, and Threads.@spawn inside a for loop. Here are the examples:
pmap(args -> fv_strucgrid(args...), [[320, 320, pi, pi, rand(2*320+2*320), scrfun] for i = 1:10000])
@distributed for i = 1:10000 fv_strucgrid(320, 320, pi, pi, rand(2*320+2*320), scrfun); end
Threads.@threads for i = 1:10000 fv_strucgrid(320, 320, pi, pi, rand(2*320+2*320), scrfun); end
@sync for i = 1:10000 Threads.@spawn fv_strucgrid(320, 320, pi, pi, rand(2*320+2*320), scrfun); end
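For anyone who wants to try the two threaded variants without my solver, here is a minimal sketch. fake_solve is a hypothetical stand-in for fv_strucgrid, and the run count is reduced to 100. The sketch also stores each result, which my one-liners above do not do.

```julia
using Base.Threads

# Hypothetical stand-in for fv_strucgrid: any pure function of the boundary vector
fake_solve(bc) = sum(abs2, bc)

n = 100
bcs = [rand(2 * 320 + 2 * 320) for _ in 1:n]   # one boundary vector per run

# Serial reference result
serial = [fake_solve(bc) for bc in bcs]

# Threads.@threads: the loop iterations are divided among the available threads
res_threads = Vector{Float64}(undef, n)
@threads for i in 1:n
    res_threads[i] = fake_solve(bcs[i])   # each iteration writes only its own slot
end

# Threads.@spawn: one task per run, scheduled dynamically; fetch collects the results
tasks = map(bcs) do bc
    Threads.@spawn fake_solve(bc)
end
res_spawn = fetch.(tasks)

println("threads available: ", nthreads())
println("results agree: ", res_threads == serial && res_spawn == serial)
```

Julia has to be started with JULIA_NUM_THREADS set (4 in my case) for the threaded versions to use more than one thread.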
The only thing that I change in each iteration is the boundary condition. So my question is: could anyone explain, or point me to any material or link that explains, the difference between these methods and help me choose the best option for my computer? Code examples are welcome too. Thanks!!