You may not need to communicate during the computation, but each worker has separate, isolated memory. This is why you need `@everywhere` to define functions on the workers. In your example, you are creating an anonymous function in the main process that uses global variables (presumably defined in your main process), which cannot be accessed by the worker processes. In short, every variable used in your anonymous function, and every function it calls, must also exist on the workers. If they don't, you can use `pmap` itself to send the information to each of the workers.
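As a minimal, self-contained sketch of the difference (with a toy `double` function standing in for your actual function):

```julia
using Distributed
addprocs(2)  # two isolated worker processes

# Define the function on every process, including the workers.
@everywhere double(x) = 2x

# Data passed as a pmap argument is shipped to the workers automatically,
# so no @everywhere is needed for `xs`.
xs = [1, 2, 3, 4]
result = pmap(double, xs)
```

Without the `@everywhere`, each worker would fail with an `UndefVarError` because `double` would only exist on the main process.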
Here it seems like the variable `xi_no_draws` doesn't exist on the workers. You could use `@everywhere` to define it there, but you can also avoid the global variable entirely by passing it as an argument to `pmap`.
An example using your variable names:

```julia
var_profit = pmap((market, no_draws) -> retailer_var_profit_m(market, dt, nu_alpha, nu_p, nu_xj, nu_xm, nu_xr, marketsize, no_draws), markets, xi_no_draws)
```
If you get a dimension mismatch error, you can pair `markets` with the same value repeated once per element (which I do lazily with `Iterators.repeated` to avoid allocating the entire array):

```julia
var_profit = pmap((market, no_draws) -> retailer_var_profit_m(market, dt, nu_alpha, nu_p, nu_xj, nu_xm, nu_xr, marketsize, no_draws), markets, Iterators.repeated(xi_no_draws, length(markets)))
```
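To see what `Iterators.repeated` produces, here is a toy example (the values are arbitrary, independent of your data):

```julia
# Lazily repeat the same value n times without allocating n copies up front.
reps = Iterators.repeated(100, 3)

# collect materializes the lazy iterator into a Vector.
materialized = collect(reps)
```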
In essence, the collections passed to `pmap` after the function are sent to the workers (each worker receives a single element from each collection for a given job), which is a good way to make sure they have access to the data they need.
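For example, with two collections `pmap` pairs up elements by position, like a parallel zip-and-map (toy values here):

```julia
using Distributed
addprocs(2)

# Each job receives one element from each collection: (1, 10), (2, 20), (3, 30).
sums = pmap((a, b) -> a + b, 1:3, [10, 20, 30])
```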
A way of writing your code to send all of the arguments would be:

```julia
using Base.Iterators

additional_args = (dt, nu_alpha, nu_p, nu_xj, nu_xm, nu_xr, marketsize, xi_no_draws)
complete_arg_list = Iterators.product(markets, [additional_args])
var_profit = pmap(complete_arg_list) do args
    market, extra_args = args  # destructure to get the market and the extra args
    retailer_var_profit_m(market, extra_args...)
end

# If you want to reshape to the same size as `markets`:
var_profit = reshape(var_profit, size(markets)...)
```
You can use `collect(complete_arg_list)` to see what the `product` line is doing.
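For instance, with toy values in place of your markets and extra arguments:

```julia
# Pairs each element of 1:3 with the single tuple in the one-element vector.
grid = Iterators.product(1:3, [(0.5, 10)])

collected = collect(grid)  # a 3×1 matrix of (market, extra_args) tuples
first_pair = first(grid)   # (1, (0.5, 10))
```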
Again, this is much simpler with multithreading, since all "workers" (the threads) have access to the same memory and you don't need to worry about `@everywhere`. It becomes a lot more ergonomic with the ThreadsX.jl package, not to mention the reduction in latency (and possibly an increase in speed). The only major issue is that poorly performing code with lots of allocations will probably be worse off with multithreading than with multiprocessing (i.e. using Distributed).