I would like to set up an MCMC workflow using Distributed.
I have a Julia script that does the following:
1. load all packages
2. load the data
3. run an MCMC chain, with index i, for i in 1:5
4. save the result in some_chain_$i.jld2
5. done! send the user a message.
I would like to parallelize step 3 with Distributed. Is there a tutorial that would get me started? I have never used this package before, so sorry if not all of the questions make sense.
I am running everything on a single server which I fully control, so processes are local. Is it sufficient to just use addprocs(5) with the local manager?
Is it enough if I load packages using @everywhere?
For the core computation, can I just do something like this:
```julia
remotes = [remotecall(my_mcmc_runner, i, data, logdensity) for i in 1:5]
map(fetch, remotes)
```
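For reference, here is a minimal sketch of how steps 1 to 3 might look with local workers. One thing to watch: the second argument of remotecall is a worker id, and in a fresh session the workers created by addprocs(5) get ids 2 through 6 (id 1 is the main process), so using i in 1:5 as the worker id would put one chain on the main process. The sampler body, the toy logdensity, and the data below are hypothetical stand-ins for the real ones.

```julia
using Distributed

# Start 5 local worker processes; addprocs returns their ids.
ws = addprocs(5)

# Hypothetical stand-in for the real sampler: random-walk Metropolis on a scalar.
# Defined with @everywhere so that every worker knows the function.
@everywhere function my_mcmc_runner(i, data, logdensity; n = 1_000)
    x = 0.0
    lp = logdensity(x, data)
    chain = Vector{Float64}(undef, n)
    for t in 1:n
        y = x + randn()                  # symmetric proposal
        lq = logdensity(y, data)
        if log(rand()) < lq - lp         # Metropolis accept/reject
            x, lp = y, lq
        end
        chain[t] = x
    end
    # With `@everywhere using JLD2`, each worker could save its own result here,
    # e.g. jldsave("some_chain_$i.jld2"; chain)
    return chain
end

# Toy log density (assumption): normal likelihood with unknown mean x.
@everywhere logdensity(x, data) = -0.5 * sum(abs2, data .- x)

data = randn(100) .+ 2.0

# remotecall(f, worker_id, args...) returns a Future immediately,
# so the five chains run concurrently, one per worker.
futures = [remotecall(my_mcmc_runner, w, i, data, logdensity)
           for (i, w) in enumerate(ws)]
chains = map(fetch, futures)             # blocks until every chain is done

rmprocs(ws)                              # shut the workers down again
```

If the number of chains ever differs from the number of workers, pmap over the chain indices is probably the simpler choice, since it hands each index to whichever worker is free.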
Perhaps consider controlling the RNGs of your workers. I don’t think two of them would realistically start with the same seed, but controlling the seeds is of course still beneficial for reproducibility.
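A minimal sketch of one way to do this, assuming each chain builds its own RNG from a base seed plus the chain index, so the draws depend on i and not on which worker happens to run the chain. The helper seeded_draws and the value of base_seed are made up for illustration; a real runner would pass rng on to the sampler instead of returning raw draws.

```julia
using Distributed

ws = addprocs(2)
@everywhere using Random

# Hypothetical helper: n normal draws for chain i from a chain-specific RNG.
@everywhere function seeded_draws(i, n; base_seed = 2024)
    rng = MersenneTwister(base_seed + i)   # seed depends only on the chain index
    return randn(rng, n)
end

# Run each chain on one worker, then again with the workers swapped.
run1 = [remotecall_fetch(seeded_draws, w, i, 3) for (i, w) in enumerate(ws)]
run2 = [remotecall_fetch(seeded_draws, reverse(ws)[i], i, 3) for i in eachindex(ws)]

same_across_workers = run1 == run2          # chain i gives identical draws either way

rmprocs(ws)
```

Passing an explicit rng into the sampler, instead of relying on the global RNG of each worker, also keeps the results independent of whatever else ran on that worker beforehand.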