How to use multiple threads (Julia >= 1.3) in a for loop, saving the results in a vector

As it took me quite some time to learn how to use the new @spawn macro in Julia >= 1.3, here is an example of how to use threads with a function that produces some output, saving the results to a vector (where order doesn’t matter):

import Base.Threads.@spawn

struct myobj
    o
end

singleOp(obj,x,y) = (x .+ y) .* obj.o

function multipleOps(obj,xbatch,ybatch)
    # Preallocate one output slot per row of the batch
    out = Array{Array{Float64,1},1}(undef,size(xbatch,1))
    for i in 1:size(xbatch,1)
        # Each iteration writes only to its own slot out[i]
        out[i] = singleOp(obj,xbatch[i,:],ybatch[i,:])
    end
    return out
end

obj = myobj(2)
xbatch = [1 2 3; 4 5 6]
ybatch = [10 20 30; 40 50 60]

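# Run multipleOps as a single task on any available thread
# (the loop inside it still runs sequentially within that task)
results = @spawn multipleOps(obj,xbatch,ybatch)
# fetch waits for the task to finish and returns its result
finalres = sum(fetch(results))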
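
The call above wraps the whole function in one task. To spread the loop iterations themselves over threads, one option is to spawn one task per row and fetch the results afterwards. This is only a minimal sketch: `multipleOpsSpawn` is a hypothetical name, not part of the original example, and it assumes Julia was started with more than one thread (for example via the `JULIA_NUM_THREADS` environment variable).

```julia
import Base.Threads.@spawn

struct myobj
    o
end

singleOp(obj, x, y) = (x .+ y) .* obj.o

# Hypothetical variant of multipleOps: one task per row instead of one task overall
function multipleOpsSpawn(obj, xbatch, ybatch)
    n = size(xbatch, 1)
    # Start one task per row; each task computes its row independently
    tasks = [@spawn singleOp(obj, xbatch[i, :], ybatch[i, :]) for i in 1:n]
    # fetch waits for each task and returns its result, kept in row order
    return [fetch(t) for t in tasks]
end

obj = myobj(2)
xbatch = [1 2 3; 4 5 6]
ybatch = [10 20 30; 40 50 60]

finalres = sum(multipleOpsSpawn(obj, xbatch, ybatch))
```

Since each task returns its own vector, no shared array is written from several threads here. For very cheap per-row work the cost of creating the tasks may well outweigh any gain.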

However, the gain in run time only became noticeable for relatively computationally expensive operations:

using BenchmarkTools

xbatch = rand(32,50)
ybatch = rand(32,50)
@benchmark sum(fetch(@spawn multipleOps(obj,xbatch,ybatch))) # 60 μs
@benchmark sum(multipleOps(obj,xbatch,ybatch)) # 24 μs

xbatch = rand(32,50000)
ybatch = rand(32,50000)
@benchmark sum(fetch(@spawn multipleOps(obj,xbatch,ybatch))) # 58 ms
@benchmark sum(multipleOps(obj,xbatch,ybatch)) # 66 ms
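
Timings like these depend on how many threads the Julia session actually has available, so it may be worth checking before benchmarking. A small sketch (the variable name `n` is only illustrative):

```julia
import Base.Threads.nthreads

# Number of threads this Julia session was started with
n = nthreads()
println("Threads available: ", n)
```

With a single thread, spawned tasks still run, but they cannot run in parallel.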

(By the way, preallocating the output with `undef`, as in `multipleOps` above, works fine; there is no need to `push!` into an empty array.)
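
A preallocated output vector can also be filled from several threads, as long as each iteration writes only to its own index. Below is a minimal sketch using `Threads.@threads` from Base. `multipleOpsThreads` is a hypothetical name, not from the original example, and Julia needs to be started with more than one thread for the loop to actually run in parallel.

```julia
import Base.Threads.@threads

struct myobj
    o
end

singleOp(obj, x, y) = (x .+ y) .* obj.o

# Hypothetical threaded variant of multipleOps using a preallocated output
function multipleOpsThreads(obj, xbatch, ybatch)
    n = size(xbatch, 1)
    # One slot per row, allocated up front
    out = Vector{Vector{Float64}}(undef, n)
    # Iterations are split among the available threads;
    # each one writes only to out[i], so no locking is needed
    @threads for i in 1:n
        out[i] = singleOp(obj, xbatch[i, :], ybatch[i, :])
    end
    return out
end

obj = myobj(2)
xbatch = rand(32, 50)
ybatch = rand(32, 50)

out = multipleOpsThreads(obj, xbatch, ybatch)
finalres = sum(out)
```

Unlike `push!`, which would change a shared vector from several threads at once, indexed writes to distinct slots do not interfere with each other.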