Improve performance of matrix computation

Ah, my fault, I forgot that @parallel without a reduction is actually asynchronous, so you need @sync @parallel. With syncing, however, data transfer becomes the bottleneck and the parallel version takes even longer than the serial one.
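Here is a minimal sketch of what I mean, not your actual computation: a hypothetical column-wise fill of a SharedArray. In Julia ≥ 0.7 the @parallel macro was renamed @distributed and lives in the Distributed stdlib, which is what I use below; `sin(i) * cos(j)` is just a placeholder for the real per-element work.

```julia
using Distributed
addprocs(4)                        # assumption: 4 worker processes
@everywhere using SharedArrays

n = 2_000
A = SharedArray{Float64}(n, n)     # shared memory, visible to all workers

# Without @sync the loop returns immediately; @sync waits for every worker,
# which is where the synchronization cost shows up.
@sync @distributed for j in 1:n
    for i in 1:n
        A[i, j] = sin(i) * cos(j)  # placeholder for the real per-element work
    end
end
```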

One option is to try threading instead: tell Julia how many threads to use before starting it (in a terminal):

export JULIA_NUM_THREADS=4

and change @parallel to Threads.@threads and the SharedArray to a normal Array, as in the sketch below.
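Here is the same hypothetical fill with threads; since all threads live in one process and share memory, a plain Array works and nothing needs to be copied between workers:

```julia
n = 2_000
A = Array{Float64}(undef, n, n)

Threads.@threads for j in 1:n      # columns are split across the threads
    for i in 1:n
        A[i, j] = sin(i) * cos(j)  # placeholder for the real per-element work
    end
end
```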

Overall, though, I’d bet on a correct memory layout (Julia arrays are column-major, so iterate down columns in the inner loop) and on broadcasting, which fuses elementwise operations and often lets them be vectorized automatically. My latest example f3 seems to be broken right now (sorry for that, I’ll try to fix it later today), but you should get the general idea.
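In the meantime, here is a small sketch of that point (not f3 itself, and the kernel is just a stand-in): the explicit loop keeps the row index innermost so memory is traversed in column order, and the broadcast expresses the same computation as one fused pass.

```julia
n = 2_000
x = rand(n)
y = rand(n)

# Cache-friendly loops: j (columns) outermost, i (rows) innermost,
# because Julia stores arrays column-major.
A = Array{Float64}(undef, n, n)
for j in 1:n, i in 1:n
    A[i, j] = x[i] * y[j]          # hypothetical stand-in for the real kernel
end

# The same result as a single fused broadcast: one pass over A, no temporaries.
B = x .* y'
@assert A ≈ B
```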