Distributed Julia computing failed in RStudio

I successfully set up JuliaCall in R:

install.packages("JuliaCall")
devtools::install_github("Non-Contradiction/JuliaCall") # get the development version of JuliaCall
library(JuliaCall)
julia=julia_setup(JULIA_HOME = "/Applications/Julia-1.1.app/Contents/Resources/julia/bin") # Julia set up successfully

However, the parallel computing does not work in R:

julia_command("using SharedArrays, Distributed, Profile")
julia_command("addprocs(4)")
julia_command("T=100;")
julia_command("a = SharedArray{Float64}(T,4);")

julia_command('@distributed for j = 1:4
    for i = 2:T
        a[i,j] = a[i-1,j] + 1;
    end;
end;')

julia_command("a")

The printed “a” is an array of all 0’s, but each column should run from 0 to 99 from top to bottom. Hence, I inferred that the parallel computation was not performed at all. However, when I run the same “@distributed for … end” loop in Juno, it works well.

Since I have already written down many lines of R code, I only want to change the computationally intensive part (which includes parallel computing) into Julia for acceleration, rather than rewrite everything in Julia.

How can I run Julia’s distributed computing in R?
Thank you.

Hi there! Hopefully, someone will be along to help you soon, but just FYI, it looks like you’re using quoting intended for speech rather than code. You want to use 3 backticks at the top and bottom of code blocks or press the button that looks like </>. Compare

function foo(speech)
println(“this doesn’t look good”)
end

to

function bar(code)
    println("ahh, much better")
end

Hi Ziyi, I am the author of JuliaCall and I can replicate your problem. After some simple experiments, I suspect the issue occurs because JuliaCall doesn’t invoke Julia’s event loop, which parallel computing needs, especially in RStudio. If I invoke the Julia event loop manually, then the computation happens and a is updated.

For example, in console R (not in RStudio!), if I do the following after your code:

> julia_command("a") ## a is not updated  
> julia_console() ## invoke the julia console, which also invokes julia event loop in console R
Preparing julia REPL, press Ctrl + D to quit julia REPL.
You could get more information in how to use julia REPL at <https://docs.julialang.org/en/stable/manual/interacting-with-julia/>
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.3 (2018-12-18)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> ## Press Ctrl + D to quit the julia console.
> julia_command("a") ## then a is updated
100×4 SharedArray{Float64,2}:
  0.0   0.0   0.0   0.0
  1.0   1.0   1.0   1.0
  2.0   2.0   2.0   2.0
  3.0   3.0   3.0   3.0
  4.0   4.0   4.0   4.0
  5.0   5.0   5.0   5.0
  6.0   6.0   6.0   6.0
  7.0   7.0   7.0   7.0
  8.0   8.0   8.0   8.0
  9.0   9.0   9.0   9.0
  ⋮                    
 91.0  91.0  91.0  91.0
 92.0  92.0  92.0  92.0
 93.0  93.0  93.0  93.0
 94.0  94.0  94.0  94.0
 95.0  95.0  95.0  95.0
 96.0  96.0  96.0  96.0
 97.0  97.0  97.0  97.0
 98.0  98.0  98.0  98.0
 99.0  99.0  99.0  99.0

But this method will not work in RStudio, and I don’t know how to quit julia_console() programmatically.
I won’t have time to delve deeply into this until next weekend.
If you need this right now, I suggest you try removing the parallel computing part in Julia. Some years ago, I wrote a computation-intensive R script for some Bayesian methods, and I made a considerable effort to optimize the R script and used some parallel computing in it. But I still got more than a tenfold speedup when I translated the script into very simple Julia without optimization or parallel computing.
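
One more detail that may be worth checking, separate from the event-loop question: according to the Julia documentation, `@distributed` without a reducer returns immediately instead of waiting for the workers, and prefixing it with `@sync` makes the call block until the loop has finished. The sketch below shows how that variant could be sent through JuliaCall. It is untested in RStudio, and whether blocking is enough to get the work done there is an assumption. `julia_loop` and `run_sync_loop` are names made up for this illustration, and `julia_setup()` is called without `JULIA_HOME`, which assumes Julia can be found on the PATH.

```r
# Julia code for the loop, with @sync added so the call waits for the workers.
# size(a, 1) replaces the global T (an illustrative change, not from the thread).
julia_loop <- '
@sync @distributed for j = 1:4
    for i = 2:size(a, 1)
        a[i, j] = a[i - 1, j] + 1
    end
end
'

# Hypothetical helper: sets up Julia, runs the loop, and copies the result to R.
run_sync_loop <- function(n_rows = 100L, n_workers = 4L) {
  JuliaCall::julia_setup()  # assumption: Julia is on the PATH
  JuliaCall::julia_command("using SharedArrays, Distributed")
  JuliaCall::julia_command(sprintf("addprocs(%d);", n_workers))
  JuliaCall::julia_command(sprintf("a = SharedArray{Float64}(%d, 4);", n_rows))
  JuliaCall::julia_command("fill!(a, 0.0);")  # start from a known state
  JuliaCall::julia_command(julia_loop)
  JuliaCall::julia_eval("Array(a)")  # plain Array, so it converts to an R matrix
}
```

Calling `m <- run_sync_loop()` and then inspecting `m[, 1]` would show whether the column was filled in; if it is still all zeros, the event-loop explanation above remains the more likely cause.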

Dear Changchen and kevbonham,

Thank you very much for your advice.

Coincidentally, I’m struggling with MCMC (a Bayesian method), which needs a large number of iterations to converge. Changchen’s advice gives me more motivation to translate my code into Julia.

Furthermore, I later found a way to parallelize Julia code in RStudio by using the R function “parLapply”. The simple experiment below shows that “parLapply” (4 parallel processes) is faster than “lapply” (4 sequential calls), with both running Julia code in each call. In addition, I found that the longer the pause (“sleep(15)”), the more time “parLapply” saves.

Thanks.

Best regards,

Ziyi Chen

> library(parallel)
> f_R=function(k){
  +   library(JuliaCall)
  +   julia2=julia_setup(JULIA_HOME = "/Applications/Julia-1.1.app/Contents/Resources/julia/bin")
  +   julia_command("sleep(15)")  #Pause 15s.
  +   return(k)
  + }
> n.cores=4
> cl=makeCluster(mc <- getOption("cl.cores", n.cores))

> N=10
> time_parLapply=rep(NA,N)
> for(k in 1:N){
  +   time_parLapply[k]=proc.time()[3]
  +   result_R=parLapply(cl=cl,X=1:n.cores,fun=f_R)
  +   time_parLapply[k]=proc.time()[3]-time_parLapply[k]
  + }
> stopCluster(cl)
> rm(cl)
> time_lapply=rep(NA,N)
> for(k in 1:N){
  +   time_lapply[k]=proc.time()[3]
  +   result_R=lapply(X=1:4,FUN=f_R)
  +   time_lapply[k]=proc.time()[3]-time_lapply[k]
  + }

> time_parLapply
[1] 34.081 15.007 15.006 15.005 15.009 15.005 15.008 15.006
[9] 15.007 15.007
> time_lapply
[1] 60.021 60.021 60.020 60.013 60.011 60.028 60.032 60.025
[9] 60.023 60.016
> mean(time_parLapply)
[1] 16.9141
> mean(time_lapply)
[1] 60.021
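
The timing pattern above can be reproduced without Julia, which may help separate the gain from parallelism from the cost of starting Julia. Below is a minimal sketch in plain R, where `Sys.sleep()` stands in for `julia_command("sleep(15)")`; `slow_task`, the 0.5 s pause, and the two workers are choices made for this illustration. The 34 s first `parLapply` run above may reflect the one-time cost of `julia_setup()` on each fresh worker, though that is a guess from the numbers rather than something measured.

```r
library(parallel)

# Stand-in for the Julia call: each task just pauses, then returns its input.
slow_task <- function(k) {
  Sys.sleep(0.5)
  k
}

n_workers <- 2
cl <- makeCluster(n_workers)

# Two tasks on two workers run side by side: roughly 0.5 s of wall time.
t_par <- system.time(res_par <- parLapply(cl, 1:n_workers, slow_task))[["elapsed"]]
stopCluster(cl)

# The same two tasks one after another: roughly 1 s of wall time.
t_seq <- system.time(res_seq <- lapply(1:n_workers, slow_task))[["elapsed"]]

cat(sprintf("parLapply: %.2f s, lapply: %.2f s\n", t_par, t_seq))
```

`system.time()` is used here in place of subtracting `proc.time()` values by hand; both report the same elapsed wall-clock time.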

Is there a reason you are interfacing Julia and R together? What you hope to achieve can be done purely in Julia.

@affans I have already written many lines of R code. For the parts that involve little computation but use many R functions, I would rather keep using R than translate so many lines of code into Julia. In addition, my whole algorithm framework is in R, so I would like to move only the computationally intensive part into Julia for acceleration.

By integrating Julia and R, I strike a balance between running time and the time spent translating code.

Hi Ziyi,

I don’t know which OS you’re on, but if you’re using Linux or macOS, you can use the “mclapply” function from the same package, parallel. In my opinion, it’s easier to use and to understand than “makeCluster”.

You just have to type:

mclapply(1:100, function(x){…}, mc.cores = n), where n is the number of processes you want to use. Here, 1:100 is your range and x takes each value in that range in turn.
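
For reference, a runnable version of that call might look like the sketch below. The argument that sets the number of worker processes in `mclapply` is `mc.cores`; `detectCores()` is a separate function that reports how many cores the machine has. The squaring function and the cap of two cores are placeholders chosen for this example. On Windows, `mclapply` only works with `mc.cores = 1`, hence the check.

```r
library(parallel)

# Use at most two forked processes; Windows cannot fork, so fall back to one.
# na.rm = TRUE covers the case where detectCores() cannot tell and returns NA.
n_cores <- if (.Platform$OS.type == "windows") {
  1L
} else {
  min(2L, detectCores(), na.rm = TRUE)
}

# Placeholder workload: square each element of the range.
squares <- mclapply(1:100, function(x) x^2, mc.cores = n_cores)

# mclapply returns a list, like lapply; flatten it for printing.
print(head(unlist(squares)))
```

The `parallel` documentation warns against forking from GUI front ends such as RStudio, so running this from a terminal R session may be the safer choice.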

For my part, I’m looking to distribute Julia computations across several machines (or servers), for example when I fit a deep neural network model, in order to avoid memory allocation issues with backpropagation.

If someone has any ideas, I’m open. :)

Hi Yatma,

I’m glad to receive your message and know that we both pursue parallel computing in R-Julia mixed computation.

I have tried mclapply on my Mac, but it often returned a null list, so I switched to parLapply. However, it does not save much time compared to pure R: 4 parallel threads in R take about 90-110 minutes, while 4 parallel threads calling Julia through JuliaCall within parLapply() take 78 minutes. A single thread calling Julia through JuliaCall takes only 31-39 minutes.

I’m also open to good ideas from anyone else for R-Julia acceleration.

Thanks.
Ziyi