Distributed Julia computing failed in Rstudio

Ziyi_Chen · April 20, 2019, 9:00pm

I successfully set up JuliaCall in R:

install.packages("JuliaCall")
devtools::install_github("Non-Contradiction/JuliaCall") # get the development version of JuliaCall
library(JuliaCall)
julia=julia_setup(JULIA_HOME = "/Applications/Julia-1.1.app/Contents/Resources/julia/bin") # Julia set up successfully

However, the parallel computing does not work in R:

julia_command("using SharedArrays, Distributed, Profile")
julia_command("addprocs(4)")
julia_command("T=100;")
julia_command("a = SharedArray{Float64}(T,4);")

julia_command('@distributed for j = 1:4
    for i=2:T
        a[i,j] = a[i-1,j]+1;
    end;
end;')

julia_command("a")

The printed "a" is an array of all 0's, but each column should run from 0 to 99 from top to bottom. Hence, I inferred that the parallel computation is not being done at all. However, when I copy "@distributed for … end" into Juno, it works well.

Since I have already written down many lines of R code, I only want to change the computationally intensive part (which includes parallel computing) into Julia for acceleration, rather than rewrite everything in Julia.

How can I run Julia’s distributed computing in R?
Thank you.

kevbonham · April 21, 2019, 12:30am

Hi there! Hopefully, someone will be along to help you soon, but just FYI, it looks like you’re using quoting intended for speech rather than code. You want to use 3 backticks at the top and bottom of code blocks or press the button that looks like </>. Compare

function foo(speech)
println(“this doesn’t look good”)
end

to

function bar(code)
    println("ahh, much better")
end

Non-Contradiction · April 21, 2019, 12:37am

Hi Ziyi, I am the author of JuliaCall and I can replicate your problem. After some simple experiments, I suspect the issue occurs because JuliaCall doesn't invoke Julia's event loop, which parallel computing needs, especially in RStudio. If I invoke the Julia event loop manually, then the computation happens and a is updated.

For example, in console R (not in RStudio!), if I do the following after your code:

> julia_command("a") ## a is not updated  
> julia_console() ## invoke the julia console, which also invokes julia event loop in console R
Preparing julia REPL, press Ctrl + D to quit julia REPL.
You could get more information in how to use julia REPL at <https://docs.julialang.org/en/stable/manual/interacting-with-julia/>
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.3 (2018-12-18)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> ## Press Ctrl + D to quit the julia console.
> julia_command("a") ## then a is updated
100×4 SharedArray{Float64,2}:
  0.0   0.0   0.0   0.0
  1.0   1.0   1.0   1.0
  2.0   2.0   2.0   2.0
  3.0   3.0   3.0   3.0
  4.0   4.0   4.0   4.0
  5.0   5.0   5.0   5.0
  6.0   6.0   6.0   6.0
  7.0   7.0   7.0   7.0
  8.0   8.0   8.0   8.0
  9.0   9.0   9.0   9.0
  ⋮                    
 91.0  91.0  91.0  91.0
 92.0  92.0  92.0  92.0
 93.0  93.0  93.0  93.0
 94.0  94.0  94.0  94.0
 95.0  95.0  95.0  95.0
 96.0  96.0  96.0  96.0
 97.0  97.0  97.0  97.0
 98.0  98.0  98.0  98.0
 99.0  99.0  99.0  99.0

But this method will not work in RStudio, and I don't know how to quit julia_console() programmatically.
I won't have time to delve deeply into this until next weekend.
If you need this right now, I suggest trying to remove the parallel computing part in Julia. Some years ago, I wrote a computation-intensive R script for some Bayesian methods, and I made a considerable effort to optimize the R script and used some parallel computing in it. But I still got a more than tenfold speedup when I translated the script into very simple Julia without optimization or parallel computing.
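One Julia-side detail that may be worth checking here: a `@distributed for` loop without a reducer returns immediately and runs asynchronously, so the caller has to wait for it explicitly, for example by prefixing it with `@sync`. The following is a minimal sketch in plain Julia, not a confirmed fix for the RStudio case. The helper name `fill_columns!` is hypothetical, and the explicit `fill!` is an added assumption so that the result does not depend on the initial contents of the shared memory.

```julia
using Distributed
addprocs(4)                      # same worker count as in the question
@everywhere using SharedArrays   # load SharedArrays on the master and on every worker

# Hypothetical helper: fill each column with 0, 1, 2, ... in parallel over columns.
function fill_columns!(a)
    T = size(a, 1)
    # @sync blocks until the asynchronous @distributed loop has finished
    @sync @distributed for j in 1:size(a, 2)
        for i in 2:T
            a[i, j] = a[i-1, j] + 1
        end
    end
    return a
end

a = SharedArray{Float64}(100, 4)
fill!(a, 0.0)                    # explicit initialization (assumption, see above)
fill_columns!(a)
a[:, 1] == 0:99                  # true once the loop has finished
```

Without `@sync`, reading `a` right after the loop can still show the initial values because the workers may not have run yet.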

Ziyi_Chen · April 21, 2019, 2:31am

Dear Changchen and kevbonham,

Thank you very much for your advice.

Coincidentally, I'm struggling with MCMC (a Bayesian method), which needs a large number of iterations to converge. Changchen's advice gives me more motivation to translate my code into Julia.

Furthermore, I later found a way to parallelize Julia code in RStudio by using the R function "parLapply". The simple experiment below shows that "parLapply" (4 parallel processes) is faster than "lapply" (4 sequential processes), with both running Julia code in each process. In addition, as the pause time ("sleep(15)") increases, I found that "parLapply" saves more time.

Thanks.

Best regards,

Ziyi Chen

> library(parallel)
> f_R=function(k){
  +   library(JuliaCall)
  +   julia2=julia_setup(JULIA_HOME = "/Applications/Julia-1.1.app/Contents/Resources/julia/bin")
  +   julia_command("sleep(15)")  #Pause 15s.
  +   return(k)
  + }
> n.cores=4
> cl=makeCluster(mc <- getOption("cl.cores", n.cores))

> N=10
> time_parLapply=rep(NA,N)
> for(k in 1:N){
  +   time_parLapply[k]=proc.time()[3]
  +   result_R=parLapply(cl=cl,X=1:n.cores,fun=f_R)
  +   time_parLapply[k]=proc.time()[3]-time_parLapply[k]
  + }
> stopCluster(cl)
> rm(cl)
> time_lapply=rep(NA,N)
> for(k in 1:N){
  +   time_lapply[k]=proc.time()[3]
  +   result_R=lapply(X=1:4,FUN=f_R)
  +   time_lapply[k]=proc.time()[3]-time_lapply[k]
  + }

> time_parLapply
[1] 34.081 15.007 15.006 15.005 15.009 15.005 15.008 15.006
[9] 15.007 15.007
> time_lapply
[1] 60.021 60.021 60.020 60.013 60.011 60.028 60.032 60.025
[9] 60.023 60.016
> mean(time_parLapply)
[1] 16.9141
> mean(time_lapply)
[1] 60.021

affans · April 21, 2019, 3:12am

Is there a reason you are interfacing Julia and R together? What you hope to achieve can be done purely in Julia.

Ziyi_Chen · April 21, 2019, 1:05pm

@affans I have already written many lines of R code. For the parts that involve little computation but use many R functions, I would rather keep using the R functions than translate so many lines of code into Julia. In addition, my whole algorithm framework is in R, so I would like to change only the computationally intensive part into Julia code for acceleration.

By integrating Julia and R, I strike a balance between running time and the time spent translating code.

Yatma · April 30, 2019, 1:46pm

Hi Ziyi,

I don't know what your OS is, but if you're using Linux or Mac, you can use the "mclapply" function from the same package, parallel. In my opinion it's easier to use and to understand than the "makeCluster" function.

You just have to type:

mclapply(1:100, function(x){…}, mc.cores = n), where n is the number of cores you want to allocate. Here, 1:100 is your range and x takes each of its values in turn.

For my part, I'm looking to distribute Julia computations across several machines (or servers), for example when I fit a deep neural network model, in order to avoid memory allocation with backpropagation.

If someone has any ideas, I'm open.

Ziyi_Chen · April 30, 2019, 2:26pm

Hi Yatma,

I'm glad to receive your message and to know that we are both pursuing parallel computing in mixed R-Julia computation.

I have tried mclapply on my Mac, but it often returned a null list, so I switched to parLapply. However, it does not save much time compared to pure R. 4 parallel threads in R take about 90-110 minutes, while 4 parallel threads using juliacall() within parLapply() take 78 minutes. 1 thread alone with juliacall() takes only 31-39 minutes.

I'm also open to anyone else's ideas for R-Julia acceleration.

Thanks.
Ziyi

markobudinich · November 29, 2019, 12:18pm

Hi Non-Contradiction,

First of all, thanks for the amazing package.

Do you have an update on this issue? I'm a Julia newbie, so for now I prefer to do my data cleaning in R (I use RStudio) and then pass all the relevant files to a Julia script. However, it would be nice to write directly in the R Notebook.

Also, I was wondering if you have the same issue using Jupyter Notebooks (I don’t use them with R currently, but perhaps it will be an option).

Best,
Marko

Non-Contradiction · November 29, 2019, 5:20pm

Parallel computing from JuliaCall is complicated. It depends on the OS, the R environment, and the Julia version you use. I just tried the code on my Windows system with Julia 1.1, and it crashes RStudio and gives some errors in the R console and the R Jupyter notebook. But it worked for me months ago on my Mac with the R console.
It should still be fine to use JuliaCall without parallel computing in both RStudio and Jupyter notebooks. It is also possible to use R from Julia with RCall.jl, so you have access to both R and parallel computing in Julia.
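As a rough illustration of the RCall.jl route, here is a minimal sketch that does the parallel part in Julia and then calls an R function on the result. It assumes RCall.jl and a working R installation are available. The worker count and the toy computation are arbitrary choices for the example, not something taken from this thread.

```julia
using Distributed, RCall   # assumes RCall.jl is installed and can find R
addprocs(2)                # arbitrary worker count for the sketch

# Parallel part in Julia: with a (+) reducer, @distributed waits for all
# workers and returns the combined result.
total = @distributed (+) for k in 1:1000
    k^2
end

# Hand the result to R, apply an R function, and copy the answer back.
@rput total                # creates the R variable `total`
R"root <- sqrt(total)"     # evaluated by the embedded R session
@rget root                 # creates the Julia variable `root`
```

Here R runs only in the master process, so the workers do not need R at all.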

Ziyi_Chen · November 29, 2019, 5:27pm

Thank you all for your answers, which I will keep, though I'm not currently working on this problem.
