Yes, we do see an overhead. If I remove all Threads.@threads, we get the following timings for different numbers of DG elements for one right-hand side evaluation.
#Elements | Runtime in seconds
1 | 1.39e-06
4 | 1.66e-06
16 | 2.86e-06
64 | 7.70e-06
256 | 2.69e-05
1024 | 1.04e-04
4096 | 4.27e-04
16384 | 1.87e-03
65536 | 9.75e-03
If we use Threads.@threads with a single thread, as we do now, I get
#Elements | Runtime in seconds
1 | 1.86e-05
4 | 1.87e-05
16 | 2.03e-05
64 | 2.68e-05
256 | 4.62e-05
1024 | 1.25e-04
4096 | 4.44e-04
16384 | 1.93e-03
65536 | 9.67e-03
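To make the comparison easier to read, here is a small sketch that computes the difference and the ratio of the two sets of timings. The numbers are copied from the two tables above; the variable names (`elements`, `serial`, `threaded`, `overhead`, `ratio`) are only chosen for this illustration and do not come from the code that was benchmarked.

```julia
using Printf

# Timings copied from the two tables above
# (seconds per right-hand side evaluation).
elements = [1, 4, 16, 64, 256, 1024, 4096, 16384, 65536]
serial   = [1.39e-06, 1.66e-06, 2.86e-06, 7.70e-06, 2.69e-05,
            1.04e-04, 4.27e-04, 1.87e-03, 9.75e-03]   # without Threads.@threads
threaded = [1.86e-05, 1.87e-05, 2.03e-05, 2.68e-05, 4.62e-05,
            1.25e-04, 4.44e-04, 1.93e-03, 9.67e-03]   # Threads.@threads, one thread

overhead = threaded .- serial   # absolute difference in seconds
ratio    = threaded ./ serial   # slowdown factor relative to the serial run

println("#Elements | Overhead in seconds | Ratio")
for i in eachindex(elements)
    @printf("%9d | %19.2e | %5.2f\n", elements[i], overhead[i], ratio[i])
end
```

For up to 4096 elements the difference stays at roughly 1.7e-05 to 2.1e-05 seconds, so it looks like a fixed cost per threaded loop invocation rather than something that grows with the problem size. That is a factor of about 13 for a single element but only about 4 % at 4096 elements. For the two largest cases the difference is small compared to the runtime and even changes sign, so I would not read anything into it.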
Thanks for this link!