I want to multiply a series of matrices by a vector, and it looks like @spawn is the fastest way to multithread this. Unfortunately, the MWE below only returns Task Done.
using Base.Threads: @spawn  # @spawn lives in Base.Threads
x = collect(1:1:10)
y = x .+ x
vecmat = [[x+y for x in x, y in y] for i=1:length(x)]
newvecs = @spawn map(m -> m * x, vecmat)  # newvecs is a Task, not the result
The below works, but seems non-Julian.
x = collect(1:1:10)
y = x .+ x
vecmat = [[x+y for x in x, y in y] for i=1:length(x)]
newvecs = []
@spawn push!(newvecs, map(m -> m * x, vecmat))
Any thoughts on making @spawn and map return the value instead of Task Done in a Julian way?
Or an alternative solution that’s just as fast?
@spawn is asynchronous and thus cannot immediately return a useful result. Instead, it spawns a new task to do the work and immediately returns a handle to that task, which keeps running in the background.
To retrieve the result, you can use fetch, which blocks until the task is done and then returns its result.
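A minimal sketch of that pattern, reusing the MWE from the question. The comprehension variables are renamed to a and b for readability, and the names t, tasks, and newvecs_threaded are only illustrative. The second half is one possible way to spread the work across threads, assuming Julia was started with more than one thread:

```julia
using Base.Threads: @spawn

x = collect(1:10)
y = x .+ x
vecmat = [[a + b for a in x, b in y] for i in 1:length(x)]

# @spawn returns a Task right away; fetch waits for it and returns its value.
t = @spawn map(m -> m * x, vecmat)
newvecs = fetch(t)

# The line above runs the whole map inside a single task. To let the
# scheduler distribute the products over threads, spawn one task per
# matrix and fetch them all.
tasks = [@spawn(m * x) for m in vecmat]
newvecs_threaded = fetch.(tasks)
```

For matrices this small the task overhead will likely dominate, so the threaded version is mainly worth it when each product is expensive.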
If you want a parallel (multiprocess) map, check out pmap from the Distributed standard library (discussed in this section of the manual). It provides a nice wrapper around this pattern and gives you some control over scheduling, batching, etc.
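As a rough sketch, the same computation with pmap could look like the following. The addprocs call is left commented out as an assumption about your setup. Without extra workers, pmap just runs on the current process.

```julia
using Distributed

# addprocs(4)  # optionally start worker processes first (the count is arbitrary)

x = collect(1:10)
y = x .+ x
vecmat = [[a + b for a in x, b in y] for i in 1:length(x)]

# pmap returns the collected results directly, so no fetch is needed.
# The closure captures x, which is sent to the workers along with it.
# Keywords such as batch_size can be passed to tune how work is grouped.
newvecs = pmap(m -> m * x, vecmat)
```

Since every matrix has to be serialized and sent to another process, this tends to pay off only when each call does a substantial amount of work.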
There’s also asyncmap, which spawns LOCAL tasks (i.e., green threads) on the calling process. It is good for things like IO/network access, where you want to make many requests that have “downtime” between initiation and completion during which Julia itself isn’t doing anything (e.g., loading a bunch of small objects from S3).
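A small illustration of asyncmap, using a made-up fake_download function in which sleep stands in for the waiting time of a real network request (the function name and the "object-N" strings are hypothetical):

```julia
# Hypothetical stand-in for a slow request: sleep yields to other tasks,
# much like waiting on IO would.
function fake_download(id)
    sleep(0.1)
    return "object-$id"
end

# Up to ntasks calls are in flight at once; results keep the input order.
results = asyncmap(fake_download, 1:20; ntasks = 10)
```

Because the tasks overlap their waiting time, this should finish much faster than a plain map over the same function, even on a single thread. It would not help for the CPU-bound matrix products in the question.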
And finally, there is a whole family of packages built on Transducers.jl that provide abstractions for this kind of work. They are “execution engine agnostic”: the same code can run locally in one process, with multithreading, or with multiprocessing.