Running a function in parallel

I need help getting a function to run in parallel. I want to distribute both the data and the packages to all workers; see the following example (which doesn't work):

# Simulated data (n observations, p variables, tr_p1/tr_p2 true variables, sig noise level)
n, p, q, tr_p1, tr_p2, sig = 1000, 5000, 2, 10, 5, 0.5
X = randn(n, p)
Y = randn(n, q)
B_true1 = [randn(tr_p1)..., zeros(p-tr_p1)...]
B_true2 = [randn(tr_p2)..., zeros(p-tr_p2)...]
y1 = X*B_true1 + sig*randn(n)
y2 = X*B_true2 + sig*randn(n)
Y[:,1] = y1
Y[:,2] = y2

# Initialize 4 workers
using Distributed
addprocs(4)
@everywhere using StructuredOptimization, Random

@everywhere function l1sep(λ1, λ2)    # opt function with random permutation of train and test data
    cvind = randperm(size(X, 1))
    Ytrain = Y[cvind[1:800], :]
    Ytest = Y[cvind[801:size(X, 1)], :]
    Xtrain = X[cvind[1:800], :]
    Xtest = X[cvind[801:size(X, 1)], :]
    B = Variable(size(X, 2), size(Y, 2))
    @minimize ls(Xtrain*B - Ytrain) + λ1*norm(B[:,1], 1) + λ2*norm(B[:,2], 1) with ZeroFPR()
    Bhat = copy(~B)
    Ytestpred = Xtest*Bhat
    MSEtest = (0.5*norm(Ytestpred - Ytest, 2)^2)/(length(Ytest)*size(X, 2))
    return MSEtest
end

# Trying to run the 4 optimizations in parallel doesn't work:
XY = @spawn X, Y
# How do I collect MSEtest from the different workers?

I can see that your call to pmap is not correct. Here is a dummy example that may help you:

using Distributed

@everywhere function foo(a, b)
    aa = a + b*9
    return aa
end

pmap(x -> foo(x[1], x[2]), [(1, 2), (3, 4)])

# or
X = [1 3]
Y = [2 4]
XY = [(X[i],Y[i]) for i in 1:length(X)]
pmap(x->foo(x[1],x[2]),  XY)
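Applied to your problem, the pattern might look like the sketch below. Here `mse_dummy` is just a placeholder for your `l1sep`, and the λ values are arbitrary; the key assumption is that `X` and `Y` must exist on every worker before the function can use them, which `$`-interpolation inside `@everywhere` takes care of:

```julia
using Distributed
addprocs(4)

# Generate the data on the master, then ship it to every worker:
# $-interpolation inside @everywhere copies the local value into
# each worker's global scope.
X = randn(100, 20)
Y = randn(100, 2)
@everywhere X = $X
@everywhere Y = $Y

# Stand-in for l1sep: any function that reads the worker-local X and Y
@everywhere function mse_dummy(λ1, λ2)
    return λ1 * sum(abs2, X[1, :]) + λ2 * sum(abs2, Y[1, :])
end

# One (λ1, λ2) tuple per run; pmap returns the results in input order,
# so no extra bookkeeping is needed to collect them
λs = [(0.1, 0.1), (0.1, 1.0), (1.0, 0.1), (1.0, 1.0)]
MSEs = pmap(λ -> mse_dummy(λ[1], λ[2]), λs)
```

The returned `MSEs` is an ordinary `Vector` on the master process, with one entry per λ pair.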

I’m not sure I follow you, but neither of these works:

pmap(x -> l1sep(x[1], x[2]), [Y X])

XY = [(Y[i,:], X[i,:]) for i in 1:size(X, 1)]