Again on reaching optimal parallel scaling

Interesting thread, I'll read through it, thanks. :slight_smile:

Just to be clear, this particular comment was referring to the sequential loops written with @floop, not the parallel loops. See the BlockVector speedup w.r.t. iterate in [RFC/ANN] FLoops.jl: fast generic for loops (foldl for humans™) (note: back then there was no parallel @floop). You can also write the nested loops manually quite easily in this case anyway, and you don't need them to be generic over the collection type. So I probably shouldn't have shoehorned in the FLoops advertisement :slight_smile:
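For what it's worth, here is a minimal sketch of what I mean by writing the nested loop manually. It assumes the blocks are simply stored as a plain vector of vectors, which is only a stand-in for a real block-structured vector and not the BlockArrays API. The name `nested_sum` is made up for the illustration.

```julia
# Hypothetical helper: sum over a block-structured collection with an
# explicit nested loop (outer loop over blocks, inner loop over elements).
# `blocks` is assumed to be a vector of vectors, not an actual BlockVector.
function nested_sum(blocks)
    acc = zero(eltype(eltype(blocks)))
    for block in blocks      # outer loop: one iteration per block
        for x in block       # inner loop: tight loop over a contiguous block
            acc += x
        end
    end
    return acc
end

blocks = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
total = nested_sum(blocks)
```

The inner loop runs over a plain array, so the compiler sees a simple loop instead of a flattened iterator that has to track which block it is in on every step. The price is that the function is tied to this particular nesting and is not generic over the collection type.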
