Hi,
I am relatively new to Julia and I am running into a weird issue when trying to parallelize my code.
I use the @parallel approach with a SharedArray (but I have the same issue when I use pmap).
So, I have something like this:
addprocs(7)
A = SharedArray{Float64}(a_size, b_size, c_size)
toler = 1e-3
maxit = 1000
while (metric1 > toler) && (iter1 < maxit)
    @inbounds @sync @parallel for i in 1:c_size
        A[:, :, i] = compute_A(fs, A0[:, :, i], i)
    end
    A_new = sdata(A)
    metric1 = maximum(abs.(A_new - A0))
    A0 = copy(A_new)
    iter1 = iter1 + 1
    println("$(iter1)  $(metric1)")
end
where the inputs of the function “compute_A” are:
-fs is a “DataType” defined by me
-A0 is an array
-i is the index I'm looping over (dimension c_size)
So, it seems to be working fine when I compute it like this.
The problem appears when I wrap this up in a function like:
addprocs(7)
function myfunction(fs::DataType, toler::Float64, maxit::Int)
    A = SharedArray{Float64}(a_size, b_size, c_size)
    while (metric1 > toler) && (iter1 < maxit)
        @inbounds @sync @parallel for i in 1:c_size
            A[:, :, i] = compute_A(fs, A0[:, :, i], i)
        end
        A_new = sdata(A)
        metric1 = maximum(abs.(A_new - A0))
        A0 = copy(A_new)
        iter1 = iter1 + 1
        println("$(iter1)  $(metric1)")
    end
end
Calling myfunction(fs, 1e-3, 1000)
runs WAY slower than the top-level version (roughly 600 seconds versus 6). This seems extremely weird, and
I don’t understand what I am doing wrong, but there is definitely something I am missing. I was hoping I could get some help here. Thanks a lot for your time and help.
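
For anyone who wants to try the comparison, the function version above could be sketched in a self-contained form roughly as follows. This is only a sketch under several assumptions: it targets Julia 1.x, where `@parallel` has been renamed `@distributed` and `Distributed` and `SharedArrays` have to be loaded explicitly. The `Params` type, the body of `compute_A`, the function name `iterate_A`, the array sizes and the worker count are all hypothetical stand-ins, not the code from the post.

```julia
using Distributed
addprocs(2)                        # worker count is arbitrary for this sketch
@everywhere using SharedArrays

@everywhere begin
    # Hypothetical stand-in for the user-defined type behind `fs`.
    struct Params
        scale::Float64
    end

    # Hypothetical stand-in for compute_A: with scale < 1 this is a
    # contraction, so the fixed-point iteration below converges.
    compute_A(fs::Params, A0_slice, i) = fs.scale .* A0_slice .+ i
end

# Same loop structure as in the post, but with every variable passed in
# or initialized locally instead of being read from global scope.
function iterate_A(fs::Params, A0::Array{Float64,3}, toler::Float64, maxit::Int)
    A = SharedArray{Float64}(size(A0))
    A_prev = copy(A0)              # previous iterate, updated in place
    metric1 = Inf
    iter1 = 0
    while (metric1 > toler) && (iter1 < maxit)
        @sync @distributed for i in 1:size(A_prev, 3)
            A[:, :, i] = compute_A(fs, A_prev[:, :, i], i)
        end
        metric1 = maximum(abs.(sdata(A) .- A_prev))
        copyto!(A_prev, sdata(A))
        iter1 += 1
    end
    return A_prev, iter1, metric1
end

A_final, iters, err = iterate_A(Params(0.5), zeros(4, 3, 2), 1e-3, 1000)
println("$(iters)  $(err)")
```

When timing something like this, calling the function once before wrapping a second call in `@time` keeps compilation time out of the measurement.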