Multithreading for nested loops

henry2004y · June 27, 2021, 12:25pm

Thanks! In my first attempt with Tullio, I did not import LoopVectorization. In fact, applying the @avx macro to this micro example gives an immediate 4x speedup without any threading. However, that is not what I initially wanted to test (and it is probably not a fair comparison?). If I interpret it correctly, @avx plays a role similar to the simd clause in OpenMP.

Now with 2 threads and no @avx, Tullio gives me a 1.5x speedup over the base case, but @threads on the outer loop is slightly faster still:

Number of threads = 2
base line:
  158.373 ms (2 allocations: 61.04 MiB)
@threads on the outer loop:
  99.353 ms (14 allocations: 61.04 MiB)
@tullio on the nested loops:
  109.440 ms (18 allocations: 61.04 MiB)
@tullio avx on the nested loops:
  34.556 ms (17 allocations: 61.04 MiB)
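
For anyone reading along, here is a minimal sketch of the "base line" versus "@threads on the outer loop" pattern being compared. The kernel, the function names fill_serial and fill_threaded, and the array sizes are hypothetical stand-ins, not the code that produced the timings above; only Base is used, so the Tullio and @avx variants are not shown.

```julia
using Base.Threads: @threads

# Hypothetical kernel (an assumption, not the benchmarked one):
# C[i, j] = a[i] * b[j] + 1.0, written as two nested loops.
function fill_serial(a::Vector{Float64}, b::Vector{Float64})
    C = Matrix{Float64}(undef, length(a), length(b))
    for j in axes(C, 2)            # outer loop over columns
        for i in axes(C, 1)        # inner loop walks down one column (column-major)
            @inbounds C[i, j] = a[i] * b[j] + 1.0
        end
    end
    return C
end

# Same loops, with the outer loop split across the available threads.
# Each iteration j writes only to column j, so there is no data race.
function fill_threaded(a::Vector{Float64}, b::Vector{Float64})
    C = Matrix{Float64}(undef, length(a), length(b))
    @threads for j in axes(C, 2)
        for i in axes(C, 1)
            @inbounds C[i, j] = a[i] * b[j] + 1.0
        end
    end
    return C
end

a = rand(200)
b = rand(300)
C_serial = fill_serial(a, b)
C_threaded = fill_threaded(a, b)
```

To time this with more than one thread, Julia has to be started with something like julia -t 2; the two functions can then be compared with @btime from BenchmarkTools.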
