Parallel Computing For Loop With Dictionaries

What’s the best way to implement parallel computing here? I’m new to parallel computing and to Julia. I have eight physical cores.

The process function returns two dictionaries.

@time begin
  ncores = 8  # eight physical cores
  param = "reaction" # reaction, mesh, diffusion
  r_start = 0
  r_end = 0.5
  r_length = 10
  parameter_range = range(r_start, r_end, length=r_length)
  u_steps, speed_steps = process(param, parameter_range, ncores)
  data = reduce(vcat, values(speed_steps))
end

Inside process, there is a for loop that iterates over parameter_range. This is where I can employ parallel computing as the order does not matter. At the end of each iteration, I add two results to two separate dictionaries.

function process(param, parameter_range, ncores)
  a = 0
  b = 20
  T = 10
  t_steps = 10^4
  global solutions = Dict()
  global speed = Dict()
  for key in parameter_range
    key = round(key, digits = 5)
    if param == "reaction"
      M = 512
      D = 1
      alpha = key
    end
    if param == "mesh"
      alpha = -0.5
      D = 1
      M = key
    end
    if param == "diffusion"
      alpha = 0.10
      M = 512
      D = key
    end

    k = T / t_steps
    h = (b - a) / M
    mu = k / h^2
    boundary = "HN"

    x = [h * i for i = 0:M]
    t = [k * i for i = 0:t_steps]

    # initial condition
    u0 = initial_condition(x)

    # crank matrices
    local B_inv, P = crank(M, mu, D, boundary)
    # time step
    local time_steps = calc_new_step(B_inv, P, u0, D, k, t_steps, M, alpha)
    p_title = string("reduced nagumo:", " ", "M = ", M, " ", "alpha = ", alpha, " ",
    "k = ", k, " ", "diff = ", D)
    Plots.contourf(x, t, time_steps, fill=true, c=:vik, title=p_title, xlabel="x", ylabel="t", dpi=300)

    # saves the current plot:
    global output_name = string(p_title, ".png")
    global subfolder = joinpath(pwd(), string(param))
    if !ispath(subfolder)
      mkdir(subfolder)
    end
    output_loc = joinpath(subfolder, output_name)
    savefig(output_loc)
    complete = DataFrame(Transpose(time_steps),:auto)
    solutions[key] = complete
    speed_key = calc_speed(complete, h, k, M, alpha, D, param, key)
    speed[key] = speed_key
  end
  return solutions, speed
end

The following doesn’t work.

using Base.Threads

@threads for key in parameter_range

I get the following error (the output from several threads is interleaved; condensed here).

GKS: Specified workstation is open in routine OPEN_WS
GKS: GKS not in proper state. GKS must be either in the state WSOP or WSAC in routine ACTIVATE_WS

signal (11): Segmentation fault
in expression starting at /home/wesley/repos/pde/juliatest_parallel.jl:148
set_clip_path at /home/wesley/.julia/packages/GR/9Vi4m/src/…/deps/gr/lib/./svgplugin.so (unknown line)
gks_svgplugin at /home/wesley/.julia/packages/GR/9Vi4m/src/…/deps/gr/lib/./svgplugin.so (unknown line)
gks_open_ws at /home/wesley/.julia/packages/GR/9Vi4m/src/…/deps/gr/lib/libGR.so (unknown line)
initgks at /home/wesley/.julia/packages/GR/9Vi4m/src/…/deps/gr/lib/libGR.so (unknown line)
gr_setcharheight at /home/wesley/.julia/packages/GR/9Vi4m/src/…/deps/gr/lib/libGR.so (unknown line)
setcharheight at /home/wesley/.julia/packages/GR/9Vi4m/src/GR.jl:1470
#gr_set_font#351 at /home/wesley/.julia/packages/Plots/z5Msu/src/backends/gr.jl:390
gr_set_font at /home/wesley/.julia/packages/Plots/z5Msu/src/backends/gr.jl:389
_update_min_padding! at /home/wesley/.julia/packages/Plots/z5Msu/src/backends/gr.jl:735
map at ./abstractarray.jl:2294 [inlined]
_update_min_padding! at /home/wesley/.julia/packages/Plots/z5Msu/src/layouts.jl:282
prepare_output at /home/wesley/.julia/packages/Plots/z5Msu/src/plot.jl:189
show at /home/wesley/.julia/packages/Plots/z5Msu/src/output.jl:214 [inlined]
png at /home/wesley/.julia/packages/Plots/z5Msu/src/output.jl:7
savefig at /home/wesley/.julia/packages/Plots/z5Msu/src/output.jl:124
savefig at /home/wesley/.julia/packages/Plots/z5Msu/src/output.jl:129
macro expansion at /home/wesley/repos/pde/juliatest_parallel.jl:140 [inlined]
#13#threadsfor_fun at ./threadingconstructs.jl:81
#13#threadsfor_fun at ./threadingconstructs.jl:48
start_task at /buildworker/worker/package_linux64/build/src/task.c:839

Allocations: 62183429 (Pool: 61684794; Big: 498635); GC: 52

I would recommend a thorough read of the performance tips section of the Julia manual to make sure your code is efficient on a single process before you try to make it parallel. Your code has a few issues (global untyped dictionaries are an immediate red flag) that, when fixed, might deliver more than the ~8x speedup you’d get from parallelism.

Separately, the error you’re seeing is from the GR plotting library, which is an external C library and may not cooperate well with Julia’s multithreading. You might be better off with pmap.

Thanks for the tip. I was wondering whether typing would make much of a difference, because the elements are matrices. I saw pmap mentioned in a few places but hadn’t looked it up.

Typing is one part of the problem, but the bigger issue is global - it makes it hard for the compiler to ensure that whatever variable it’s working on hasn’t been touched or modified by some other part of the program, which prevents most optimizations.
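
For instance, here’s a sketch of the difference (collect_solutions is a hypothetical stand-in for your process function, using the DataFrame values it builds):

using DataFrames

# untyped global: the compiler can't assume anything about what
# `solutions` holds, or even that it stays a Dict
global solutions = Dict()

# better: local to the function, with concrete key and value types
function collect_solutions(parameter_range)
    solutions = Dict{Float64, DataFrame}()
    for key in parameter_range
        solutions[key] = DataFrame()  # placeholder for the real result
    end
    return solutions
end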

I initially declared the Dicts as local, but the code wouldn’t run until I declared them global. However, I just tried declaring them local again and it ran just fine.

I misspoke (mistyped) earlier. The solutions and speed dictionaries have float keys and DataFrame elements. This seems to work, but I must be doing something wrong because I’m not seeing any performance gains.


solutions = Dict{Float64, DataFrame}()

Here’s my attempt at pmap.

param = "reaction" # reaction, mesh, diffusion
r_start = 0
r_end = 0.5
r_length = 10
parameter_range = range(r_start, r_end, length=r_length)
u_steps, speed_steps = pmap(process, param, parameter_range)
data = reduce(vcat, values(speed_steps))
@everywhere begin
function process(param, parameter_range)
  a = 0
  b = 20
  T = 10
  t_steps = 10^4
  local solutions = Dict{Float64, DataFrame}()
  local speed = Dict{Float64, DataFrame}()
  for key in parameter_range
    key = round(key, digits = 5)
    if param == "reaction"
      M = 512
      D = 1
      alpha = key
    end
    if param == "mesh"
      alpha = -0.5
      D = 1
      M = key
    end
    if param == "diffusion"
      alpha = 0.10
      M = 512
      D = key
    end

    k = T / t_steps
    h = (b - a) / M
    mu = k / h^2
    boundary = "HN"

    x = [h * i for i = 0:M]
    t = [k * i for i = 0:t_steps]

    # initial condition
    u0 = initial_condition(x)

    # crank matrices
    local B_inv, P = crank(M, mu, D, boundary)
    # time step
    local time_steps = calc_new_step(B_inv, P, u0, D, k, t_steps, M, alpha)
    p_title = string("reduced nagumo:", " ", "M = ", M, " ", "alpha = ", alpha, " ",
    "k = ", k, " ", "diff = ", D)
    Plots.contourf(x, t, time_steps, fill=true, c=:vik, title=p_title, xlabel="x", ylabel="t", dpi=300)

    # saves the current plot:
    global output_name = string(p_title, ".png")
    global subfolder = joinpath(pwd(), string(param))
    if !ispath(subfolder)
      mkdir(subfolder)
    end
    output_loc = joinpath(subfolder, output_name)
    savefig(output_loc)
    complete = DataFrame(Transpose(time_steps),:auto)
    solutions[key] = complete
    speed_key = calc_speed(complete, h, k, M, alpha, D, param, key)
    speed[key] = speed_key
  end
  return solutions, speed
end
end

I get the error message

LoadError: UndefVarError: M not defined
in expression starting at /home/wesley/repos/pde/juliatest_parallel.jl:150
(::Base.var"#837#839")(x::Task) at asyncmap.jl:177
foreach(f::Base.var"#837#839", itr::Vector{Any}) at abstractarray.jl:2141
maptwice(wrapped_f::Function, chnl::Channel{Any}, worker_tasks::Vector{Any}, c::Base.Iterators.Zip{Tuple{String, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}}) at asyncmap.jl:177
wrap_n_exec_twice at asyncmap.jl:153 [inlined]
#async_usemap#822 at asyncmap.jl:103 [inlined]
async_usemap at asyncmap.jl:85 [inlined]
#asyncmap#821 at asyncmap.jl:81 [inlined]
asyncmap at asyncmap.jl:81 [inlined]
pmap(f::Function, p::WorkerPool, c::Base.Iterators.Zip{Tuple{String, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}}; distributed::Bool, batch_size::Int64, on_error::Nothing, retry_delays::Vector{Any}, retry_check::Nothing) at pmap.jl:126
pmap(f::Function, p::WorkerPool, c::Base.Iterators.Zip{Tuple{String, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}}) at pmap.jl:101
pmap(f::Function, c::Base.Iterators.Zip{Tuple{String, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}}}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) at pmap.jl:156
pmap at pmap.jl:156 [inlined]
pmap(f::Function, c1::String, c::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) at pmap.jl:157
pmap(f::Function, c1::String, c::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}}) at pmap.jl:157
macro expansion at juliatest_parallel.jl:156 [inlined]
top-level scope at timing.jl:210
eval at boot.jl:360 [inlined]
include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String) at loading.jl:1094

Before doing pmap, make sure your function works with map. You’ll probably need to rearrange your code a bit - map essentially takes the place of a for loop, and you should use it to replace the for key in parameter_range loop.
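
For example, with run_one as a hypothetical stand-in for the body of your loop:

# for-loop version
results = []
for key in parameter_range
    push!(results, run_one(key))
end

# map version: same results, with the loop replaced by map
results = map(run_one, parameter_range)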

I agree with stillyslalom. For parallelizing this kind of computation, making sequential map work is the best first step. You can then use various APIs for parallelizing it, as I noted in A quick introduction to data parallelism in Julia.

Regarding the actual question in the OP, since the keys (parameter_range) are unique, there’s no point in using dictionaries as output. You can just use Dict(zip(parameter_range, the_result)) to get the dictionary.
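
For example:

julia> Dict(zip(1:3, [1, 4, 9]))
Dict{Int64, Int64} with 3 entries:
  2 => 4
  3 => 9
  1 => 1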

Having said that, you can use Folds.jl to directly obtain the dictionary (which can be beneficial when hashing is the bottleneck and there are some overlaps in the keys):

julia> using Folds

julia> Folds.dict(x => x^2 for x in 1:3)
Dict{Int64, Int64} with 3 entries:
  2 => 4
  3 => 9
  1 => 1

Creating two dictionaries is a bit more involved but possible:

julia> using Folds, Transducers, BangBang, MicroCollections

julia> Folds.mapreduce(ProductRF(merge!!, merge!!), 1:3; init=(EmptyDict(), EmptyDict())) do x
           (SingletonDict(x => x^2), SingletonDict(x => x^2))
       end
(Dict(2 => 4, 3 => 9, 1 => 1), Dict(2 => 4, 3 => 9, 1 => 1))

Thanks for the replies.

I tried map and it says that M is not defined. M is defined in a conditional inside the process function. Is there some incompatibility with map here?

These issues are hard to diagnose unless you can provide code and a stacktrace. As a guess, are you still passing param to map? If so, you need to instead wrap those parameters in a closure. Here’s what’s probably happening (using println as a dummy function):

julia> map(println, "reaction", 1:10)
r1
e2
a3
c4
t5
i6
o7
n8

Since String is an iterable type in Julia, param is getting mapped as a series of characters, so it never satisfies any of your conditionals (and thus M, D, and alpha are never defined). The solution might look something like this.

function process(param, param_range)
    a = 0
    b = 20
    T = 10
    #... other setup stuff
    solutions = Dict{Float64, DataFrame}()  # assuming DataFrame values, as in your code
    speed = Dict{Float64, DataFrame}()
    
    map(param_range) do key
        if param == "reaction"
            M = 512
            D = 1
            alpha = key
        elseif param == "mesh"
            alpha = -0.5
            D = 1
            M = key
        elseif param == "diffusion"
            alpha = 0.10
            M = 512
            D = key
        else
            @error "Invalid parameter $param"
        end

        #... other computations

        solutions[key] = complete
        speed[key] = complete
        return nothing
    end

    return solutions, speed
end

This is not very Julian - it should be reworked to avoid potential type instabilities in the definition of M, D, and alpha - but it should be enough to get you off the ground.
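
For instance, a sketch of one such rework: pull the branching into a helper that always returns the same concrete types, so M, D, and alpha are predictable to the compiler (select_params is hypothetical):

# always returns Tuple{Int, Float64, Float64}: (M, D, alpha)
function select_params(param, key)
    param == "reaction"  && return (512, 1.0, Float64(key))
    param == "mesh"      && return (Int(key), 1.0, -0.5)
    param == "diffusion" && return (512, Float64(key), 0.1)
    error("Invalid parameter $param")
end

M, D, alpha = select_params(param, key)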

A better approach:

using Printf

function process(; M = 512, D = 1, α = 0.10)
    a = 0
    b = 20
    T = 10
    #... other setup stuff
    solutions = Dict{String, DataFrame}()  # string keys, to match keystr below
    speed = Dict{String, DataFrame}()
    
    map(Iterators.product(M, D, α)) do (M, D, α)
        keystr = @sprintf("M = %0.5g, α = %0.5g, D = %0.5g", M, α, D)

        #... computations

        solutions[keystr] = complete
        speed[keystr] = complete
        return nothing
    end

    return solutions, speed
end

By setting the parameters as keyword arguments and using Iterators.product, you can call the function like this:

julia> process(; M = 2 .^ (5:10))
M = 32, α = 0.1, D = 1
M = 64, α = 0.1, D = 1
M = 128, α = 0.1, D = 1
M = 256, α = 0.1, D = 1
M = 512, α = 0.1, D = 1
M = 1024, α = 0.1, D = 1

…or this

julia> process(; α = [-0.5, 0.1], D = [1.0, 1.2])
M = 512, α = -0.5, D = 1
M = 512, α = -0.5, D = 1.2
M = 512, α = 0.1, D = 1
M = 512, α = 0.1, D = 1.2

Thanks. I got map to work and then I just changed it to pmap. What’s the best way to round the elements in parameter_range?

r_start = 0
r_end = 0.50
r_length = 100
parameter_range = range(r_start, r_end, length=r_length)
u_steps, speed_steps = process(; alpha = parameter_range, D = [1.0], M = [1024])
function process(; M = 512, D = 1, alpha = 0.10)
  a = 0
  b = 20
  T = 10
  t_steps = 10^4
  local solutions = Dict{String, DataFrame}()
  local speed = Dict{String, DataFrame}()
  #key = round(alpha, digits = 5)
  pmap(Iterators.product(M, D, alpha)) do (M, D, alpha)
        keystr = @sprintf("M = %0.5g, α = %0.5g, D = %0.5g", M, alpha, D)

Have you read the documentation of pmap? pmap won’t do you any good unless you use Distributed to addprocs first. Have you already done this step?

There are two parallel paradigms: shared-memory parallelism and distributed (often embarrassingly parallel) parallelism.

The @threads macro and the associated Threads library use shared-memory parallelism. Here you have to account for data races and deadlocks, and utilize the thread pool properly. This sort of parallelism is well suited to splitting a for loop over multiple threads when the “work” of each iteration is not that large.
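
A minimal sketch of that pattern (f is a stand-in for the real per-iteration work; each iteration writes only its own slot, so there is no data race):

using Base.Threads

f(x) = x^2  # stand-in for the real per-iteration work

xs = range(0, 0.5, length=10)
results = Vector{Float64}(undef, length(xs))
@threads for i in eachindex(xs)
    results[i] = f(xs[i])  # each iteration writes its own slot
end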

For distributed parallelism, the addprocs function launches n independent Julia processes (think of this as you running julia n times in the terminal). Then pmap executes a function on all of these independent Julia processes, with the results collected at the end. The catch is that you have to make sure all the variables, libraries, and functions are available on the remote processes as well, which is why you get the M is not defined on worker n errors.
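
A minimal sketch of the correct setup (g is a stand-in for the real work; addprocs and the @everywhere definitions happen once, at the top level, before pmap is called):

using Distributed
addprocs(8)  # launch 8 worker processes

@everywhere using Printf  # load packages on every worker, too

# define the work function on every worker; the body is a stand-in
@everywhere g(x) = @sprintf("%0.5g", x^2)

results = pmap(g, range(0, 0.5, length=10))  # calls run on the workers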

In general you’d want to use pmap when the function being executed is “costly”, at least more costly than the overhead of setting up the remote processes and their data communication. (In fact, if I recall correctly, data is serialized and passed between workers over sockets, or over ssh for remote machines.) What do your computations look like? If they are quick, then threading will probably be the best option here.

If you post your full code, we may be able to help you better (and possibly just speed up your sequential code).

round has a number of keyword options to tweak according to your needs, but I can’t say what options will be suitable for your use case. You may want to set up rounding for M, D, and alpha individually, to account for the different characteristics (known only to you) of each parameter space.
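
For example, you can broadcast round over the whole range up front:

parameter_range = round.(range(0, 0.5, length=10), digits=5)
# [0.0, 0.05556, 0.11111, 0.16667, 0.22222, 0.27778, 0.33333, 0.38889, 0.44444, 0.5]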

This should be accessible. The sequential version is juliatest. I added @everywhere and addprocs to the parallel version, but I get the error below:

https://bitbucket.org/Forbeswa/computing/src/master/

LoadError: On worker 2:
LoadError: UndefVarError: @sprintf not defined
Stacktrace:
[1] top-level scope
@ :0
[2] eval
@ ./boot.jl:360
[3] #103
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:274
[4] run_work_thunk
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:63
[5] run_work_thunk
@ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:72
[6] #96
@ ./task.jl:406
in expression starting at /home/wesley/repos/computing/juliatest_parallel.jl:124
…and 7 more exceptions.
in expression starting at /home/wesley/repos/computing/juliatest_parallel.jl:114
sync_end(c::Channel{Any}) at task.jl:364
macro expansion at task.jl:383 [inlined]
remotecall_eval(m::Module, procs::Vector{Int64}, ex::Expr) at macros.jl:223
top-level scope at macros.jl:207
eval at boot.jl:360 [inlined]
include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String) at loading.jl:1094

@sprintf is from the Printf standard library - it’s an easy way to combine strings & variables rounded to a specified precision. I included it at the top of my earlier answer (the using Printf line).
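
For example:

julia> using Printf

julia> @sprintf("α = %0.5g", 0.0555555555)
"α = 0.055556"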

Yeah. It worked until I added @everywhere and addprocs().

@everywhere function process(; M = 512, D = 1, alpha = 0.10)
  a = 0
  b = 20
  T = 10
  t_steps = 10^4
  local solutions = Dict{String, DataFrame}()
  local speed = Dict{String, DataFrame}()
  #key = round(alpha, digits = 5)
  addprocs(8)
  pmap(Iterators.product(M, D, alpha)) do (M, D, alpha)
        keystr = @sprintf("M = %0.5g, α = %0.5g, D = %0.5g", M, alpha, D)

@everywhere using Printf?

I thought that I had to use @everywhere in conjunction with pmap.