Thread- and process-based parallelisms in Transducers.jl (+ some news)

It’s been a while since I added parallelism support in Transducers.jl, but I’ve never announced this feature properly. I just added a few utility functions and a tutorial, so I think it’s a good time to do this.

Quoting Overview of parallel processing in Transducers.jl:

Transducers.jl supports thread-based (reduce) and process-based (dreduce) parallelism with the same composable API, i.e. transducers. Having a uniform API that covers different parallelisms as well as sequential processing (foldl) is useful. Using multiple cores or machines for your computation is as easy as replacing foldl with reduce or dreduce; you don’t need to re-write your transducers or reducing functions.


Thread-based parallelism

Transducers.jl supports thread-based parallelism for Julia ≥ 1.0. You can use it by replacing foldl with reduce. With Julia ≥ 1.3, Transducers.jl supports early termination to avoid unnecessary computation while guaranteeing that the result is deterministic; i.e., it does not depend on how computation tasks are scheduled.
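As a small sketch of the foldl-to-reduce swap described above (the transducer, reducing function, and `basesize` value here are illustrative choices, not taken from the post):

```julia
using Transducers  # assumes the package is installed

xs = 1:1_000_000

# Sequential fold: sum of squares.
seq = foldl(+, Map(x -> x^2), xs)

# Thread-based parallel fold: same transducer, same reducing function.
# `basesize` controls how finely the input is chunked across tasks.
par = reduce(+, Map(x -> x^2), xs; basesize = 10_000)

@assert seq == par  # the result does not depend on scheduling
```

The only change between the two calls is the fold function itself; the transducer pipeline is untouched.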

Process-based parallelism

Transducers.jl supports process-based parallelism using Distributed.jl. You can use it by replacing foldl with dreduce. It can be used for horizontally scaling the computation. It is also useful for using external libraries that are not “thread-safe.”

Note that early termination is not supported in dreduce yet.

Misc news


Great work!


I forgot to mention this, but Example for the Depth first multithread implementation performance gain as a motivation reminded me that the early termination feature depends on the Julia scheduler being depth-first. The computed result is deterministic and scheduler-independent. However, depth-first scheduling makes it possible to terminate as early as possible when the reduction is written in a divide-and-conquer style. It also makes the implementation very straightforward, if not trivial. A big thanks to the Julia dev team!
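As a minimal sketch of the early-termination mechanism (using the `reduced` function exported by Transducers.jl; the predicate and input are illustrative, and the same reducing function can also be used with the threaded reduce on Julia ≥ 1.3):

```julia
using Transducers

# Find the first square greater than 1000 and stop the fold there.
# `reduced(x)` wraps the accumulator to signal termination; the
# returned value is deterministic regardless of how tasks are scheduled.
firsthit = foldl(Map(x -> x^2), 1:10^6; init = nothing) do _, x
    x > 1000 ? reduced(x) : nothing
end
```

Only a tiny prefix of the million-element range is actually processed before the fold stops.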




Does Transducers.jl support nested threaded parallelism similar to raw @spawn? That is, in the [contrived] example

using Transducers

function f1(x)
    xs = x .+ rand(10000)
    return reduce(+, Map(sin), xs)
end

reduce(+, Map(f1), 1:10000)

Is thread-based parallelism used both at the top-level reduce and within each f1?

Transducers.jl is implemented with @spawn in Julia ≥ 1.3, so it naturally supports nested parallelism as in your example. (But I think it will crash in Julia < 1.3.)