ChunkSplitters.jl: Simple parallel chunk splitter

lmiq · September 28, 2022, 5:51pm

I have just put up this small package that provides a chunks function to be used within threaded loops. It is something that I find useful and appears somewhat often here:

This is so basic that I have always the impression that it must be done already, perhaps even in Base as a type of Iterator. function.

Yet, for example, Iterators.partition is not what one wants, because the partition is not the most even possible:

julia> collect.(collect(Iterators.partition((1:7), 3)))
3-element Vector{Vector{Int64}}:
 [1, 2, 3]
 [4, 5, 6]
 [7]

vs.

julia> collect.(ChunkSplitters.chunks(1:7, i, 3) for i in 1:3)
3-element Vector{Vector{Int64}}:
 [1, 2, 3]
 [4, 5]
 [6, 7]

Thus, one question: Is there such simple functionality in Base or in a light-dependency package?

And, if not, I hope this small package is useful for others, and I may register it then.

aplavin · September 28, 2022, 6:05pm

Maybe this could be added to IterTools.jl? It already includes a partition function generalized in a differnet direction.

lmiq · September 28, 2022, 6:11pm

That seems a a good package to have that. I’ll try to make a PR.

ffevotte · September 28, 2022, 7:27pm

Yet another related package:

I don’t think it implements the splitting policies you want, but it targets very similar use cases.

lmiq · September 28, 2022, 8:01pm

Nice! By looking at IteratorTools I now implemented the splitting as an iterator (Julia is actually cool, isn’t it?).

It works like this:

julia> using ChunkSplitters 

julia> x = rand(7);

julia> Threads.@threads for (range,ichunk) in chunks(x, 3, :batch)
           @show (range, ichunk)
       end
(range, ichunk) = (6:7, 3)
(range, ichunk) = (1:3, 1)
(range, ichunk) = (4:5, 2)

Such that we can do, slightly more cleanly:

julia> using ChunkSplitters

julia> function sum_parallel(f, x; nchunks=Threads.nthreads())
           s = fill(zero(eltype(x)), nchunks)
           Threads.@threads for (range, ichunk) in chunks(x, nchunks)
               for i in range
                  s[ichunk] += f(x[i])
              end
           end
           return sum(s)
       end
sum_parallel (generic function with 1 methods)

julia> x = rand(10^7);

julia> Threads.nthreads()
12

julia> @btime sum(x -> log(x)^7, $x)
  115.026 ms (0 allocations: 0 bytes)
-5.062317099586189e10

julia> @btime sum_parallel(x -> log(x)^7, $x; nchunks=128)
  19.210 ms (74 allocations: 7.58 KiB)
-5.062317099585973e10

That performs nicely, for example in comparison with:

julia> @btime ThreadsX.sum(x -> log(x)^7, $x)
  18.127 ms (1103 allocations: 74.14 KiB)

of course in this case there is no reason not to use ThreadsX, but that splitter allows for more customizable threading problems.

lmiq · September 29, 2022, 4:38pm

I’ve made a pull request to IterTools: chunks by lmiq · Pull Request #95 · JuliaCollections/IterTools.jl · GitHub

And I may be registering the package anyways, so I can experiment more with the functionality (also I don’t know when or even if the PR will be accepted).

lmiq · April 20, 2023, 7:53pm

Finally this package, although simple, turned out to be useful, and more people are using it. Thus, a 1.0.0 version was released recently, and now I’ve put up a small but decent docs page:

https://m3g.github.io/ChunkSplitters.jl

lmiq · January 25, 2024, 1:13pm

We released now a new version of ChunkSplitters.jl (v2.1.0) with a new and breaking interface (although the release itself is not breaking because the previous interface is still functional, although no longer documented):

Now, a typical use of chunks is of the form:

julia> x = rand(7);

julia> @threads for inds in chunks(x; n=3, split=:batch)
           @show inds
       end
inds = 1:1:3
inds = 4:1:5
inds = 6:1:7

and it returns only the indices of the array of each chunk. Additionally, the number of chunks is
set by a keyword parameter, n, and the type of splitting is set by the optional split keyword.

To retrieve the chunk index (to avoid using threadid()) enumerate should be used:

julia> @threads for (ichunk, inds) in enumerate(chunks(x; n=3))
           @show ichunk, inds
       end
(ichunk, inds) = (1, 1:1:3)
(ichunk, inds) = (2, 4:1:5)
(ichunk, inds) = (3, 6:1:7)

This last output is equivalent to the previous interface with a call of the form for (ichunk, inds) in chunks(x, 3). This last form is still supported, but will be deprecated in version 3.0 of the package.

This change makes many codes cleaner when they do not use the chunk index, and also is consistent with the general Julia syntax of using enumerate to retrieve the indices of the counter of the iteration.

Important: the new interface is currently the only documented one.

j-fu · January 25, 2024, 1:41pm

I was using the ranges returned by chunks as input to a constructor of iterators, this doesn’t seem to be possible with the new interface - so may be the old one still can be kept documented ? This is the code:

github.com

j-fu/GridVisualize.jl/blob/048224278ed738aee0672147c1ee53a6f462ef7f/src/griditerator.jl#L10


      
          struct LinearSimplices{D,Tc,Ti,Tf} <: LinearSimplexIterator{D}
              coord::Matrix{Tc}
              cellnodes::Matrix{Ti}
              values::Vector{Tf}
              gridscale::Tc
              range::StepRange{Int,Int}
              ichunk::Int
          end
          
          function LinearSimplices(coord::Matrix{Tc},cn::Matrix{Ti},f::Vector{Tf};gridscale=1.0, nthreads=Threads.nthreads()) where {Tc,Ti,Tf}
              ncells=size(cn,2)
              dim=size(coord,1)
              map(chunks(1:ncells,nthreads)) do c
          	LinearSimplices{dim,Tc,Ti,Tf}(coord,cn,f,gridscale,c...)
              end
          end	
          
          function LinearSimplices(g::ExtendableGrid,f::Vector;nthreads=Threads.nthreads(), gridscale=1.0)
              LinearSimplices(g[Coordinates],g[CellNodes],f;nthreads,gridscale)
          end

EDIT: It is planned that the iterator can run locally over each simplex, so length(chunk) != length(iterator) (the later not known a priori). Here is the parallel loop: GridVisualizeTools.jl/src/marching_iterators.jl at e0acfcee9ca515625ff89efdd8292d550910b6ee · j-fu/GridVisualizeTools.jl · GitHub

lmiq · January 25, 2024, 1:57pm

In versions 2.X which is the current one, that still works. The new interface for that case is:

julia> map(enumerate(chunks(1:7; n=3))) do c
          c
       end
3-element Vector{Tuple{Int64, StepRange{Int64, Int64}}}:
 (1, 1:1:3)
 (2, 4:1:5)
 (3, 6:1:7)

You have to add the enumerate there, and use n = nthreads in your case.

For the time being we have no reason to release the 3.0 version, but if it is not much trouble, I would recommend updating to the new interface.

j-fu · January 25, 2024, 2:02pm

Ok, cool thanks!

Topic		Replies	Views
Split vector into N (potentially unequal length) subvectors? General Usage	9	4048	March 3, 2023
Parallelizing over an iterable Performance question , parallel	18	940	January 29, 2024
Shorthand for ((i-1)k+1):(ik) General Usage question , indexing	5	217	November 18, 2025
No more threadid indexing? [thread-local storage] Performance multithreading , threads	36	1580	August 15, 2025
Why has simple threading using `threadid` become so complex in v1.12...? Performance	39	1863	August 15, 2025

ChunkSplitters.jl: Simple parallel chunk splitter

Important: the new interface is currently the only documented one.

Related topics