Parallelizing for loop in the computation of a gradient

Yeah, but then I can add an additional function barrier with an explicit kernel function, and even if some types are messed up due to the closure bug, I only have to pay for the dynamic dispatch nthreads() times.
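
Something like this is what I mean (a minimal sketch with made-up names; grad_sample is just a placeholder for whatever the per-sample gradient actually is):

```julia
using Base.Threads

# Placeholder per-sample gradient; stands in for the real computation.
grad_sample(x, θ) = 2 * (θ - x)

# Explicit kernel acting as a function barrier: inside this call the
# argument types are concrete, so the hot loop compiles to fast code
# even if the closure in the threaded loop boxed some variables.
function grad_kernel!(out, xs, θ, idxs)
    @inbounds for i in idxs
        out[i] = grad_sample(xs[i], θ)
    end
    return out
end

function gradient_threaded(xs, θ)
    out = similar(xs, float(eltype(xs)))
    # Split the index range into roughly one chunk per thread.
    chunks = collect(Iterators.partition(eachindex(xs), cld(length(xs), nthreads())))
    @threads for idxs in chunks
        # Any dynamic dispatch caused by boxed captures happens here,
        # once per chunk (at most nthreads() times), not once per sample.
        grad_kernel!(out, xs, θ, idxs)
    end
    return out
end

# e.g. gradient_threaded(randn(10_000), 0.5)
```

So even if `out`, `xs`, or `θ` end up with non-concrete types in the closure created by the threaded loop, the only dynamically dispatched call is grad_kernel! itself, and inside the barrier everything is type-stable again.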