Parallel/distributed k-means clustering replicates

This is my first time trying parallel processing. I'd like to run replicates of the kmeans algorithm in parallel and keep the lowest-cost run, to get closer to a global minimum.

Just calling kmeans on some data works fine:

using Clustering

n_data_points = 100000
data = rand(4,n_data_points)
n_clusters = 20
R = kmeans(data, n_clusters; maxiter=1000)

I tried using a parallel for loop, but I get the error BoundsError: attempt to access 0-element Vector{Float64} at index [1].

using Distributed
using SharedArrays

n_reps = 32
cost_R = SharedArray{Float64}(n_reps)
asgn_R = SharedArray{Int64}(n_data_points, n_reps)

addprocs()
@everywhere using Pkg
@everywhere Pkg.activate("..")
@everywhere using Clustering

@sync @distributed for repᵢ ∈ 1:n_reps
    R = kmeans(data, n_clusters; maxiter=1000)
    cost_R[repᵢ] = sum(R.costs)
    asgn_R[:,repᵢ] = assignments(R)
end

If I leave out addprocs() and run the loop, it executes fine.

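Putting addprocs() before the SharedArrays are created solves the issue. As far as I understand, a SharedArray is mapped only on the processes that exist when it is constructed, so workers added afterwards see it as an empty array, which is why indexing it raises the BoundsError.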
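For reference, here is a minimal sketch of the reordered script, with the workers started before any SharedArray is constructed. The worker count, the smaller problem size, and the names best_rep and best_assignments are my own choices for illustration. It also assumes Clustering is installed in the environment the workers start in; if it is not, activate your project on every process as in the question.

```julia
using Distributed

# Start the workers FIRST, so the SharedArrays created below are mapped on them.
# 2 workers is an arbitrary choice for this sketch; addprocs() with no argument
# starts one worker per core.
addprocs(2)

# Load the packages on the master and on every worker.
# Assumption: Clustering is available in the workers' environment.
@everywhere using SharedArrays
@everywhere using Clustering

n_data_points = 10_000          # smaller than in the question, to keep the sketch quick
data = rand(4, n_data_points)
n_clusters = 20
n_reps = 8

# Created after addprocs(), so every worker participates in the shared memory.
cost_R = SharedArray{Float64}(n_reps)
asgn_R = SharedArray{Int64}(n_data_points, n_reps)

@sync @distributed for rep in 1:n_reps
    R = kmeans(data, n_clusters; maxiter=1000)
    cost_R[rep] = sum(R.costs)          # total cost of this replicate
    asgn_R[:, rep] = assignments(R)     # cluster index of every data point
end

# Keep the replicate with the lowest total cost.
best_rep = argmin(cost_R)
best_assignments = asgn_R[:, best_rep]
```

To check which processes a SharedArray is mapped on, compare procs(cost_R) with workers(). With the original ordering, I would expect the newly added workers to be missing from procs(cost_R).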

(Thanks Krystian Gulinski on Slack)