# Best practices for a parallel implementation of the Gibbs sampler algorithm

Would someone comment on how to improve my code? In particular:

1. Did I use `@sync` properly?
2. Did I use distributed arrays (`DArray`) efficiently?
3. How does random number generation work on parallel cores? Should each core be started with its own seed?
``````
using Distributed  # provides @everywhere, @distributed and nworkers
using Gibbs: ConvertToNumber, ConvertToLetter, CalcFreq
@everywhere using Gibbs: GibbsSampler, CalcRelEntropy
@everywhere using DistributedArrays
@everywhere using Distributions

# k and N are read from the command line
ks = ARGS[1]
Ns = ARGS[2]
const k = parse(Int, ks)
const N = parse(Int, Ns)

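# `Dna` (the input sequences) must be defined before this point; its definition is not shown here
const t = length(Dna)

const Dvec = map(ConvertToNumber, Dna)
const lStrand = length(Dvec[1])  # length of one strand
const bFreq = sum(CalcFreq(Dvec), dims=2) / (t * lStrand)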

#const MvecInit = map(ConvertToNumber, Motifs);
#Motifs = nothing

MvecInit = []

dBestScores = distribute(zeros(nworkers()))

mMinit = Array{Array{Int64,1},1}[]
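for i = 1:nworkers()
    # One placeholder motif set per worker; the comprehension gives every motif its own vector
    # (fill(zeros(Int, k), t) would repeat one shared vector t times)
    Mvec = [zeros(Int, k) for _ in 1:t]
    push!(mMinit, Mvec)
end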

dBestMvec = distribute(mMinit)

mMinit = nothing

@sync @distributed for i = 1:5760
    Mvec = GibbsSampler(Dvec, MvecInit, k, t, N)
    score = CalcRelEntropy(Mvec, bFreq)
    # Each worker keeps its best result in the single element of its local part
    if score > dBestScores[:L][1]
        dBestScores[:L][1] = score
        dBestMvec[:L][1] = Mvec
    end
end

BestScores = convert(Array, dBestScores)
BestMvecs = convert(Array, dBestMvec)

# The do-block form closes the file even if println throws
open("score.txt", "w") do fScore
    println(fScore, BestScores)
end

# iBest avoids shadowing Base.argmax
(bestScore, iBest) = findmax(BestScores)

BestMotifs = map(ConvertToLetter, BestMvecs[iBest])

for motif in BestMotifs
    println(motif)
end

``````
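
For comparison, here is a minimal sketch of an alternative I am considering for questions 1 and 3: passing a reducer to `@distributed` instead of writing into a `DArray`, and seeding each process explicitly. As far as I understand, every Julia process seeds its default RNG independently at startup, so explicit seeds should only be needed for reproducibility. `run_once`, `better`, and the base seed `1234` are hypothetical names and values, and `run_once` is only a stand-in for `GibbsSampler` followed by `CalcRelEntropy`.

```julia
using Distributed
# Workers are assumed to exist already (e.g. `julia -p 4` or `addprocs(4)`);
# with no workers the loop simply runs on the master process.
@everywhere using Random

# Give every process its own reproducible seed (1234 is an arbitrary base value).
@everywhere Random.seed!(1234 + myid())

# Hypothetical stand-in for GibbsSampler + CalcRelEntropy: returns (score, sample).
@everywhere function run_once()
    sample = rand(1:4, 8)
    return (Float64(sum(sample)), sample)
end

# Reducer: keep whichever (score, sample) pair has the higher score.
@everywhere better(a, b) = a[1] >= b[1] ? a : b

# With a reducer, @distributed waits for all iterations and returns the
# reduced value, so no explicit @sync is needed here.
best = @distributed (better) for i in 1:1000
    run_once()
end

println("best score = ", best[1])
```

If this is right, the `DArray` bookkeeping would become unnecessary, since only the best `(score, motifs)` pair is sent back to the master process.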

Could my base-frequency calculation be optimized for speed? The vectors are short (fewer than a hundred elements).

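``````
function CalcFreq(Mvec)
    t = length(Mvec)       # number of sequences
    k = length(Mvec[1])    # length of each sequence

    # Bin[b, j] counts how often base b (encoded as 1:4) occurs at position j
    Bin = zeros(Int, 4, k)

    for i = 1:t
        for j = 1:k
            Bin[Mvec[i][j], j] += 1
        end
    end

    return Bin
end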

``````
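
One variant I could try is storing the motifs in a single matrix rather than a vector of vectors, so the inner loop reads contiguous memory. This is only a sketch under the assumption that all motifs have the same length and are encoded as integers in `1:4`; `calcfreq_matrix` is a hypothetical name, not part of my `Gibbs` module.

```julia
# Assumed layout: M is a k×t integer matrix with one motif per column, entries in 1:4.
function calcfreq_matrix(M::AbstractMatrix{<:Integer})
    k, t = size(M)
    counts = zeros(Int, 4, k)
    for i in 1:t          # loop over motifs (columns of M)
        for j in 1:k      # loop over positions within one motif
            counts[M[j, i], j] += 1
        end
    end
    return counts
end

Mvec = [[1, 2, 3], [1, 2, 4], [2, 2, 3]]   # three motifs of length 3
M = reduce(hcat, Mvec)                      # 3×3 matrix, one motif per column
counts = calcfreq_matrix(M)
println(counts)
```

For vectors this short any difference may be negligible, so it seems worth timing both versions (for example with `@btime` from BenchmarkTools) before changing the data layout.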