How does RCall transfer data from R to Julia? I am wondering because @rget takes a lot of time to transfer a dataframe (approx. 2 GB) from R to Julia. Is it supposed to be instantaneous, or is it supposed to be as slow as, say, writing and then reading a file on disk? I have no idea what it takes to transfer data from R to Julia.
In RCall, every transfer from R to Julia invokes the function rcopy, which creates a copy of the original R data in Julia (most of the time). For example, if an integer vector is transferred from R, RCall creates a new integer vector in Julia with the same content. So it is neither instantaneous nor as slow as writing and reading a file on disk, although it can seem “instantaneous” when the data is small.
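A minimal sketch of that copy behaviour, assuming a working RCall installation with R available. The variable name x is only for illustration:

```julia
using RCall

# Create an integer vector on the R side.
R"x <- c(1L, 2L, 3L)"

# @rget copies the R variable `x` into a Julia variable of the same name
# (it goes through rcopy under the hood).
@rget x

# Mutating the Julia copy does not touch the original R object.
x[1] = 100

# Fetch the first element from R again; the R vector is unchanged.
r_first = rcopy(R"x[1]")
println("Julia: ", x, "  R first element: ", r_first)
```

Since the data is duplicated, the cost grows with the size of the object being transferred.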
A dataframe is a little more complicated than a vector. Still, RCall creates a copy in Julia with the same content, and since the dataframe is big, don’t expect it to be “instantaneous”.
And what is the purpose? If you want a dataframe in Julia, it should be better to read it directly into Julia than to read it in R and then copy it to Julia through RCall.
The goal was to use the fread function in R to read a CSV file that cannot be read by CSV.read. Should it be as fast as copying a dataframe in Julia (i.e. from Julia to Julia)?
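For reference, a sketch of that workflow with the two steps timed separately, so it is visible whether the time goes into fread or into the copy to Julia. It assumes the data.table package is installed in R. The tiny temporary CSV stands in for the real file, and the as.data.frame call is a precaution so that a plain data.frame is what gets copied:

```julia
using RCall, DataFrames

# Stand-in for the real CSV: a tiny temporary file (hypothetical data).
path = tempname() * ".csv"
write(path, "a,b\n1,x\n2,y\n")

# Step 1: read on the R side with data.table::fread.
# `$path` interpolates the Julia variable into the R call.
@time R"df <- as.data.frame(data.table::fread($path))"

# Step 2: copy the R data.frame into a Julia DataFrame via rcopy.
@time df = rcopy(R"df")

println(size(df))
```

With a 2 GB file, comparing the two timings should show how much of the wait is the R-to-Julia copy itself.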
I haven’t done any benchmarks. As far as I can see, copying within Julia should always be faster than copying from R to Julia, because copying R data to Julia involves choosing a target type at run time and so is not type stable by nature.
For example, an integer vector in R is copied either to a plain integer vector or to a vector that allows missing values in Julia. I think the situation may be better in Julia v0.7, but definitely not in Julia v0.6. And RCall doesn’t work on Julia v0.7 yet…
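A small sketch of that type choice. It assumes a recent RCall release on Julia 1.x (not the v0.6-era version discussed here), where R's NA is mapped to missing:

```julia
using RCall

# The same R type (integer vector), with and without an NA.
a = rcopy(R"c(1L, 2L, 3L)")
b = rcopy(R"c(1L, NA, 3L)")

# The Julia element type depends on the values, so it is only known at run time.
println(typeof(a))
println(typeof(b))
println(ismissing(b[2]))
```

Because the result type depends on the data, the compiler cannot infer it ahead of time, which is what "not type stable" refers to.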
How about first reading the file into Julia through RCall and then saving it with HDF5, so that it can be read faster next time?
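A rough sketch of such a cache using the HDF5.jl package, storing one dataset per column. The small DataFrame is a stand-in for the one obtained through RCall, and the sketch assumes plain numeric columns without missing values; other column types would need extra handling:

```julia
using DataFrames, HDF5

# Stand-in for the DataFrame copied from R (hypothetical data).
df = DataFrame(id = [1, 2, 3], value = [0.5, 1.5, 2.5])
path = tempname() * ".h5"

# Write each column as its own dataset, named after the column.
h5open(path, "w") do f
    for name in names(df)
        write(f, name, df[!, name])
    end
end

# Later session: rebuild the DataFrame from the cached datasets.
cached = h5open(path, "r") do f
    DataFrame([name => read(f, name) for name in keys(f)])
end

println(cached)
```

The slow fread-plus-copy step then only has to run once, and later sessions read the cached file directly from Julia.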