I am trying to parallelize a simple function (doTheTransit in the MWE below). I have opened Julia in the terminal with -p 6, since I have a 6-core processor.
Explanation of my MWE:
With the function createNecessary I construct a vector of tuples; each tuple consists of a dictionary and a vector of indices. The function doTheTransit takes an input state and one such tuple. In general I want to compute a successor state to the input state. Here every element has a specific function, and therefore a specific successor element, and the i-th successor element is determined by certain elements of the existing state; these elements are the idx built in createNecessary. The dictionary created in createNecessary maps all possible combinations of the idx elements (keys) to a successor element for the i-th element (value).
I now want to use map to iterate over the vector of tuples and create the successor state by computing each successor element separately with doTheTransit.
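To make the data layout concrete, here is a small hand-built example (the values below are illustrative, not taken from the MWE) of what one tuple looks like and what a single doTheTransit lookup amounts to:

```julia
# One tuple as produced by createNecessary: a Dict from index-combinations
# to a successor bit, plus the index vector saying which state elements matter.
dicti = Dict([0, 0] => 1, [0, 1] => 0, [1, 0] => 1, [1, 1] => 0)
idx   = [2, 3]            # element i depends on state[2] and state[3]

state = [1, 0, 1]
tempi = state[idx]        # the relevant slice of the state: [0, 1]
successor = dicti[tempi]  # look up the successor element: 0
```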
This task seems perfect for parallelization, but with pmap I get far more allocations and need far more time than with the non-parallel map. I have also included Folds.map and ThreadsX.map, since these are also supposed to parallelize the given task; both need much less time than pmap, but are no faster than map.
I am very surprised by this outcome, since I had hoped to reduce the runtime of my functions by parallelizing this computation.
Am I doing something wrong? Or why does the runtime increase?
using Distributed, Folds, ThreadsX
@everywhere function createNecessary(n)
    vectorOfTuple = Vector{Tuple{Dict{Vector{Int64}, Int64}, Vector{Int}}}(undef, n)
    randi = rand(1:(n - 1), n)                # number of inputs for each element
    for i in 1:n
        dicti = Dict{Vector{Int64}, Int64}()
        idx = rand(1:n, randi[i])             # which state elements feed element i
        for j in 0:((1 << randi[i]) - 1)      # enumerate all 0/1 input combinations
            dicti[digits(j, base=2, pad=randi[i])] = rand(0:1)
        end
        vectorOfTuple[i] = (dicti, idx)
    end
    return vectorOfTuple
end
@everywhere function doAllTransitions(funcis, state)
    println("Map:")
    @time map(fun -> doTheTransit(fun, state), funcis)
    println("Pmap:")
    @time pmap(fun -> doTheTransit(fun, state), funcis)
    println("Folds.map:")
    @time Folds.map(fun -> doTheTransit(fun, state), funcis)
    println("ThreadsX.map:")
    @time ThreadsX.map(fun -> doTheTransit(fun, state), funcis)
end
@everywhere function doTheTransit(func, state)
    tempi = state[func[2]]    # pick out the relevant state elements
    return func[1][tempi]     # look up the successor element in the Dict
end
doAllTransitions(createNecessary(25), rand(0:1, 25))
The @time results were the following:
Map:
0.000758 seconds (26 allocations: 4.031 KiB)
Pmap:
810.314207 seconds (46.93 M allocations: 11.876 GiB, 27.62% gc time, 0.01% compilation time)
Folds.map:
0.076025 seconds (5.96 k allocations: 338.998 KiB, 81.34% compilation time)
ThreadsX.map:
0.090041 seconds (7.45 k allocations: 413.728 KiB, 97.14% compilation time)