Hi,
I am wondering how to apply multi-threading to nested for loops.
For the function nestedloops
below, I guess threads are only launched for the outermost k loop, JULIA_NUM_THREADS
of them in total:
function nestedloops(nx, ny, nz)
    state = ones(nx,ny,nz)
    Threads.@threads for k = 1:nz
        for j = 1:ny
            for i = 1:nx
                state[i,j,k] *= sin(i*j*k)
            end
        end
    end
    return
end
If I add more @threads
like the following:
function nestedloops2(nx, ny, nz)
    state = ones(nx,ny,nz)
    Threads.@threads for k = 1:nz
        Threads.@threads for j = 1:ny
            Threads.@threads for i = 1:nx
                state[i,j,k] *= sin(i*j*k)
            end
        end
    end
    return
end
Will it launch JULIA_NUM_THREADS^3 threads in total? At least I can see an obvious decrease in performance and a significant amount of additional memory allocation.
If I write the nested loops in a more compact way:
function nestedloops3(nx, ny, nz)
    state = ones(nx,ny,nz)
    Threads.@threads for k = 1:nz, j = 1:ny, for i = 1:nx
        state[i,j,k] *= sin(i*j*k)
    end
    return
end
This returns an error:
ERROR: LoadError: syntax: invalid assignment location "k = 1:nz"
Is there a way, as in OpenMP, to collapse the nested loops and apply SIMD, i.e., something like #pragma omp parallel for simd collapse(3)
in C?
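
To make the question concrete, here is a minimal sketch of what I mean by "collapsing", assuming it is acceptable to thread over one flattened index range built from CartesianIndices. This is only an illustration, not a confirmed equivalent of OpenMP's collapse clause. The names nestedloops_collapsed and nestedloops_simd are hypothetical, and unlike the functions above they return state so the result can be checked. The second function threads only the k loop and puts @simd on the innermost i loop; whether that loop actually vectorizes depends on the loop body.

```julia
using Base.Threads

# Hypothetical name: same computation as nestedloops, with the three loops
# flattened into a single threaded loop over all (i, j, k) index triples.
function nestedloops_collapsed(nx, ny, nz)
    state = ones(nx, ny, nz)
    indices = CartesianIndices(state)
    # Thread over linear positions 1:length(indices) and look up the
    # matching Cartesian index for each position.
    @threads for n in 1:length(indices)
        i, j, k = Tuple(indices[n])
        @inbounds state[i, j, k] *= sin(i * j * k)
    end
    return state   # returned here (unlike the original) so it can be inspected
end

# Hypothetical name: thread the outermost loop only and annotate the
# innermost loop with @simd, which is a hint rather than a guarantee.
function nestedloops_simd(nx, ny, nz)
    state = ones(nx, ny, nz)
    @threads for k in 1:nz
        for j in 1:ny
            @simd for i in 1:nx
                @inbounds state[i, j, k] *= sin(i * j * k)
            end
        end
    end
    return state
end
```

Both functions should produce the same array as a plain single-threaded triple loop. Whether either one is faster than threading only the k loop depends on nx, ny, nz and the number of threads, so it would need benchmarking.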