I’ve implemented load-balancing threaded parallel loops in FLoops.jl, which also supports a wide class of scheduling policies depending on your needs (plus other things like distributed and GPU-based parallel loops and reductions).
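
For a rough idea of what this looks like, here is a minimal sketch, assuming FLoops.jl is installed and Julia was started with more than one thread (e.g. `julia -t 4`). The function name `sum_squares` and the `basesize` value are just illustrative choices, not anything prescribed by the package:

```julia
using FLoops  # assumes the FLoops.jl package is installed

# Hypothetical helper: sum of squares computed with a threaded loop.
# `ex` is the executor that decides how iterations are scheduled;
# `basesize` is the number of iterations handled by each task.
function sum_squares(xs; ex = ThreadedEx(basesize = 4))
    @floop ex for x in xs
        @reduce(s += x^2)  # parallel-safe reduction into `s`
    end
    return s
end

sum_squares(1:100)  # → 338350
```

Swapping the scheduling policy should then only be a matter of passing a different executor as `ex`, for example `SequentialEx()` for debugging, or (as far as I know) one of the executors from FoldsThreads.jl such as `WorkStealingEx()`.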