Hi everyone,
I am developing a bioinformatics package to estimate the rate of protein adaptation using analytical theory followed by Approximate Bayesian Computation (ABC).
I use a mutable struct so that I can change some mutational features independently and output N independent estimations, which I then use to perform ABC. Since N is nearly 10^6, I have explored distributed computing to perform these N independent estimations.
The problem is that I am running the distributed computation on many copies of the mutable struct, and the values of each copy are changed inside the function that performs the whole estimation.
Here is a summary of the code that performs the estimation:
# param is the mutable struct
# N is the number of models to solve
# nTot, nLow, ngh, ngl, ngamNeg, afac, θ, ρ are the arrays containing the values used to solve the model
fac = rand(-2:0.05:2,N)
afac = @. 0.184*(2^fac)
nTot = rand(0.1:0.01:0.9,N)
lfac = rand(0.0:0.05:0.9,N)
nLow = @. nTot * lfac
ngh = rand(repeat(gH,N),N);
ngamNeg = rand(repeat(gamNeg,N),N);
θ = fill(theta,N)
ρ = fill(rho,N)
nParam = [param for i in 1:N];
# iterRates updates param with the nTot, nLow, ngh, ngl, ngamNeg, afac, θ, ρ values before performing the estimation
Distributed.pmap(iterRates, nParam, nBinom, nTot, nLow, ngh, ngl, ngamNeg, afac, θ, ρ);
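For reference, a self-contained toy version of the same pattern is sketched below. `ToyParams` and `toyIterRates` are hypothetical stand-ins (the real struct and `iterRates` are not shown here), and only three of the arrays are kept. The `deepcopy` is an assumption used to get independent copies, since `[param for i in 1:N]` stores N references to the same object.

```julia
using Distributed

# Hypothetical stand-in for the real parameter struct (its fields are not shown in this post).
@everywhere mutable struct ToyParams
    nTot::Float64
    nLow::Float64
    afac::Float64
end

# Hypothetical stand-in for iterRates: overwrite the fields, then "solve" a toy model.
@everywhere function toyIterRates(param::ToyParams, nTot, nLow, afac)
    param.nTot = nTot
    param.nLow = nLow
    param.afac = afac
    return param.afac * (param.nTot - param.nLow)
end

N = 8  # tiny N for illustration; the real N is close to 10^6
param = ToyParams(0.0, 0.0, 0.0)

fac  = rand(-2:0.05:2, N)
afac = @. 0.184 * (2^fac)
nTot = rand(0.1:0.01:0.9, N)
lfac = rand(0.0:0.05:0.9, N)
nLow = @. nTot * lfac

# One independent copy of the struct per model (assumption: deepcopy instead of N references).
nParam = [deepcopy(param) for _ in 1:N]

# Each call receives its own struct plus one element from every array.
out = pmap(toyIterRates, nParam, nTot, nLow, afac)
```

With no workers added, `pmap` simply runs on the master process. Calling `addprocs(n)` before the `@everywhere` definitions spreads the calls across n workers.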
I have never programmed with distributed computing before, but I am sure this is not the proper way to do it. Is there a better way to distribute my mutable struct and solve my model N times?
Thanks in advance!
Jesús