ChunkSplitters.jl: Simple parallel chunk splitter

I have just put up this small package that provides a chunks function to be used within threaded loops. It is something that I find useful and appears somewhat often here:

This is so basic that I have always the impression that it must be done already, perhaps even in Base as a type of Iterator. function.

Yet, for example, Iterators.partition is not what one wants, because the partition is not the most even possible:

julia> collect.(collect(Iterators.partition((1:7), 3)))
3-element Vector{Vector{Int64}}:
 [1, 2, 3]
 [4, 5, 6]


julia> collect.(ChunkSplitters.chunks(1:7, i, 3) for i in 1:3)
3-element Vector{Vector{Int64}}:
 [1, 2, 3]
 [4, 5]
 [6, 7]

Thus, one question: Is there such simple functionality in Base or in a light-dependency package?

And, if not, I hope this small package is useful for others, and I may register it then.


Maybe this could be added to IterTools.jl? It already includes a partition function generalized in a differnet direction.


That seems a a good package to have that. I’ll try to make a PR.

Yet another related package:

I don’t think it implements the splitting policies you want, but it targets very similar use cases.

1 Like

Nice! By looking at IteratorTools I now implemented the splitting as an iterator (Julia is actually cool, isn’t it?).

It works like this:

julia> using ChunkSplitters 

julia> x = rand(7);

julia> Threads.@threads for (range,ichunk) in chunks(x, 3, :batch)
           @show (range, ichunk)
(range, ichunk) = (6:7, 3)
(range, ichunk) = (1:3, 1)
(range, ichunk) = (4:5, 2)

Such that we can do, slightly more cleanly:

julia> using ChunkSplitters

julia> function sum_parallel(f, x; nchunks=Threads.nthreads())
           s = fill(zero(eltype(x)), nchunks)
           Threads.@threads for (range, ichunk) in chunks(x, nchunks)
               for i in range
                  s[ichunk] += f(x[i])
           return sum(s)
sum_parallel (generic function with 1 methods)

julia> x = rand(10^7);

julia> Threads.nthreads()

julia> @btime sum(x -> log(x)^7, $x)
  115.026 ms (0 allocations: 0 bytes)

julia> @btime sum_parallel(x -> log(x)^7, $x; nchunks=128)
  19.210 ms (74 allocations: 7.58 KiB)

That performs nicely, for example in comparison with:

julia> @btime ThreadsX.sum(x -> log(x)^7, $x)
  18.127 ms (1103 allocations: 74.14 KiB)

of course in this case there is no reason not to use ThreadsX, but that splitter allows for more customizable threading problems.

1 Like

I’ve made a pull request to IterTools: chunks by lmiq · Pull Request #95 · JuliaCollections/IterTools.jl · GitHub

And I may be registering the package anyways, so I can experiment more with the functionality (also I don’t know when or even if the PR will be accepted).

Finally this package, although simple, turned out to be useful, and more people are using it. Thus, a 1.0.0 version was released recently, and now I’ve put up a small but decent docs page:


We released now a new version of ChunkSplitters.jl (v2.1.0) with a new and breaking interface (although the release itself is not breaking because the previous interface is still functional, although no longer documented):

Now, a typical use of chunks is of the form:

julia> x = rand(7);

julia> @threads for inds in chunks(x; n=3, split=:batch)
           @show inds
inds = 1:1:3
inds = 4:1:5
inds = 6:1:7

and it returns only the indices of the array of each chunk. Additionally, the number of chunks is
set by a keyword parameter, n, and the type of splitting is set by the optional split keyword.

To retrieve the chunk index (to avoid using threadid()) enumerate should be used:

julia> @threads for (ichunk, inds) in enumerate(chunks(x; n=3))
           @show ichunk, inds
(ichunk, inds) = (1, 1:1:3)
(ichunk, inds) = (2, 4:1:5)
(ichunk, inds) = (3, 6:1:7)

This last output is equivalent to the previous interface with a call of the form for (ichunk, inds) in chunks(x, 3). This last form is still supported, but will be deprecated in version 3.0 of the package.

This change makes many codes cleaner when they do not use the chunk index, and also is consistent with the general Julia syntax of using enumerate to retrieve the indices of the counter of the iteration.

Important: the new interface is currently the only documented one.


I was using the ranges returned by chunks as input to a constructor of iterators, this doesn’t seem to be possible with the new interface - so may be the old one still can be kept documented ? This is the code:

EDIT: It is planned that the iterator can run locally over each simplex, so length(chunk) != length(iterator) (the later not known a priori). Here is the parallel loop: GridVisualizeTools.jl/src/marching_iterators.jl at e0acfcee9ca515625ff89efdd8292d550910b6ee · j-fu/GridVisualizeTools.jl · GitHub

In versions 2.X which is the current one, that still works. The new interface for that case is:

julia> map(enumerate(chunks(1:7; n=3))) do c
3-element Vector{Tuple{Int64, StepRange{Int64, Int64}}}:
 (1, 1:1:3)
 (2, 4:1:5)
 (3, 6:1:7)

You have to add the enumerate there, and use n = nthreads in your case.

For the time being we have no reason to release the 3.0 version, but if it is not much trouble, I would recommend updating to the new interface.


Ok, cool thanks!

1 Like