Best way to parallelize

You may want to limit the number of batches to the number of threads and give each batch more work, so that the overhead of spawning and scheduling tasks stays small relative to the computation itself. This can be done by hand, but I have the impression that FLoops.jl can handle that case nicely.
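To illustrate the "by hand" option, here is a minimal sketch that uses only Base Julia. The name `chunked_sum` and the `ntasks` keyword are made up for this example, and it assumes the work is a sum of `f(x)` over a vector, which may not match your actual workload. It splits the input into roughly one chunk per thread and spawns one task per chunk.

```julia
using Base.Threads  # provides nthreads() and @spawn

# Hypothetical helper (not from any package): sum f(x) over xs,
# using about one batch per thread instead of one task per element.
function chunked_sum(f, xs::AbstractVector; ntasks::Integer = nthreads())
    n = length(xs)
    n == 0 && return 0              # nothing to do; avoids a zero-length partition
    chunk_len = cld(n, ntasks)      # ceiling division, so at most ntasks chunks
    chunks = Iterators.partition(xs, chunk_len)
    tasks = map(chunks) do chunk
        @spawn sum(f, chunk)        # each task handles a whole batch
    end
    return sum(fetch, tasks)        # combine the per-batch partial sums
end

total = chunked_sum(abs2, collect(1:100))
```

Each task returns its own partial result, so no locks or shared accumulators are needed. Start Julia with more than one thread (for example `julia -t 4`) for this to run in parallel.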

See: Sum result of Threads.foreach() - #10 by tkf