Correct way to populate an empty array in parallel in Julia?


Often, in parallel code, I find myself needing to populate an empty array, and I have gone back and forth between using a vcat reduction and a SharedArray. Typical pseudo-code looks like:

using Distributed, SharedArrays

function my_parallel_sharedarray()
    LL = SharedArray{Float64}(10, 1)
    @sync @distributed for i in 1:10
        LL[i] = i^2  # placeholder for the actual value
    end
    return sum(LL)
end

function my_parallel_vcat()
    z = @distributed (vcat) for i in 1:10
        i^2  # placeholder for the actual value; the loop body's result is reduced with vcat
    end
    return sum(z)
end

What is the preferred way to populate an array (entry-by-entry, row-by-row, or column-by-column) in parallel, especially when the final array is large (~1000 × 1 or 1000 × 100, etc.)? If one of the above approaches is strictly preferred, could someone point out the downside of the sub-optimal one, ideally with a small toy example?

Is it required that you start with an empty array? If not, I suggest instead doing something like

using Distributed: pmap, addprocs


pmap([(i, j) for i in 1:10, j in 1:10]) do (i, j)
    (factorial(i) + factorial(j)) / (factorial(i) * factorial(j))
end
which will create and fill the following matrix in parallel:

10×10 Array{Float64,2}:
 2.0      1.5       1.16667   1.04167    …  1.0          1.0        
 1.5      1.0       0.666667  0.541667      0.500003     0.5        
 1.16667  0.666667  0.333333  0.208333      0.166669     0.166667   
 1.04167  0.541667  0.208333  0.0833333     0.0416694    0.0416669  
 1.00833  0.508333  0.175     0.05          0.00833609   0.00833361 
 1.00139  0.501389  0.168056  0.0430556  …  0.00139164   0.00138916 
 1.0002   0.500198  0.166865  0.0418651     0.000201168  0.000198688
 1.00002  0.500025  0.166691  0.0416915     2.75573e-5   2.50772e-5 
 1.0      0.500003  0.166669  0.0416694     5.51146e-6   3.03131e-6 
 1.0      0.5       0.166667  0.0416669     3.03131e-6   5.51146e-7 
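Note that `pmap` preserves the shape of its input, so each output entry still lines up with the `(i, j)` pair that produced it. That also makes it easy to carry an id along with each value; a small sketch (the ids here are made up for illustration):

```julia
using Distributed

# pairs of (id, input) — the ids are hypothetical, just for illustration
inputs = [(id, x) for (id, x) in zip(100:109, 1:10)]

results = pmap(inputs) do (id, x)
    (id, 1 / x)  # return the id together with the computed value
end

# every value can be traced back to its id
bad = [id for (id, v) in results if !isfinite(v)]
```

Since each result is an `(id, value)` tuple, any suspicious value can be traced back to its id without pre-allocating anything.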

Thanks for the very helpful suggestion! Yes, I would prefer to start from a pre-allocated array. The reason is that each value is tied to a unique id: if a value looks wrong (say, a NaN, -Inf, or Inf), I can use its position to recover the corresponding id and then examine all the data linked to that id. A pre-allocated array makes that tracking very simple. My worry is that if vcat keeps creating copies, the vcat approach will allocate unnecessarily for large matrices, so I want to better understand what vcat does relative to a pre-allocated SharedArray in parallel mode, drawing on the experience of the Julians here!
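A rough way to see the difference is to time both versions on a larger array. This is only a sketch (the loop body `i^2` is a stand-in for the real computation), but it lets `@time` show that the `(vcat)` reduction builds intermediate arrays on the workers while the `SharedArray` version writes each entry in place:

```julia
using Distributed, SharedArrays

function fill_shared(n)
    LL = SharedArray{Float64}(n)
    @sync @distributed for i in 1:n
        LL[i] = i^2  # placeholder computation, written in place
    end
    return sum(LL)
end

function fill_vcat(n)
    z = @distributed (vcat) for i in 1:n
        i^2  # placeholder computation; results are concatenated into a new vector
    end
    return sum(z)
end

# both return the same result; compare with e.g. @time fill_shared(1_000)
fill_shared(1_000) == fill_vcat(1_000)
```

Running each under `@time` (after a warm-up call to exclude compilation) should make the allocation behavior of the two approaches visible.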