I need help getting a function to run in parallel. I want to distribute both the data and the packages to all workers; see the following example, which doesn't work:
# Simulated data: n observations, p predictors, q responses, tr_p1/tr_p2 true variables, sig noise level
n, p, q, tr_p1, tr_p2, sig = 1000, 5000, 2, 10, 5, 0.5
X = randn(n, p)
Y = randn(n, q)
B_true1 = vcat(randn(tr_p1), zeros(p - tr_p1))
B_true2 = vcat(randn(tr_p2), zeros(p - tr_p2))
y1 = X*B_true1 + sig*randn(n)
y2 = X*B_true2 + sig*randn(n)
Y[:,1] = y1
Y[:,2] = y2
# Start 4 worker processes
using Distributed
addprocs(4)
@everywhere using StructuredOptimization, Random
@everywhere function l1sep(λ1, λ2) # objective with a random train/test split
    cvind = randperm(size(X, 1))
    Ytrain = Y[cvind[1:800], :]
    Ytest = Y[cvind[801:end], :]
    Xtrain = X[cvind[1:800], :]
    Xtest = X[cvind[801:end], :]
    B = Variable(size(X, 2), size(Y, 2))
    @minimize ls(Xtrain*B - Ytrain) + λ1*norm(B[:,1], 1) + λ2*norm(B[:,2], 1) with ZeroFPR()
    Bhat = copy(~B)
    Ytestpred = Xtest*Bhat
    MSEtest = 0.5*norm(Ytestpred - Ytest, 2)^2 / (length(Ytest)*size(X, 2))
    return MSEtest
end
# Trying to run 4 optimizations in parallel, but this doesn't work:
XY = @spawn X,Y
pmap(l1sep(5.0,5.0),XY)
# How do I collect MSEtest from the different workers?
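Here is the pattern I believe should work, shown on a simplified stand-in for `l1sep` (the actual `@minimize` fit is elided, and `l1sep_demo` is a hypothetical name): pass `X` and `Y` as function arguments, so the closure handed to `pmap` carries the data to each worker instead of referencing globals that only exist on the master process, and `pmap` itself returns the collected results as a vector.

```julia
using Distributed
addprocs(2)

@everywhere using Random

# Hypothetical stand-in for l1sep: X and Y are arguments, not globals, so
# pmap can serialize them to the workers along with the closure.
@everywhere function l1sep_demo(X, Y, λ1, λ2)
    cvind = randperm(size(X, 1))
    Ytrain = Y[cvind[1:80], :]
    # ... the real optimization would go here; return a scalar so pmap can
    # collect one result per (λ1, λ2) pair
    return λ1 + λ2 + sum(abs2, Ytrain) / length(Ytrain)
end

X = randn(100, 10)
Y = randn(100, 2)

grid = [(5.0, 5.0), (5.0, 10.0), (10.0, 5.0), (10.0, 10.0)]
MSEs = pmap(t -> l1sep_demo(X, Y, t...), grid)  # one result per tuple in grid
```

If the goal is instead to keep `l1sep(λ1, λ2)` as-is and define the data once on every worker, I believe `@everywhere X = $X; @everywhere Y = $Y` (interpolating the master's values) would do that, at the cost of copying the arrays to each process up front.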