Question about JLD2 save/load of RData nested list

Suppose I have this:

module jldtest

	using DataFrames
	using JLD2, FileIO
	using RData
	using RCall

	# make some R data 
	R"""
	rdata <- list(a = runif(3), b = letters[1:3])
	save(rdata, file = 'rr.RData')
	"""

	d = DataFrame(a = 1:3,b = rand(3))
	save(df) = FileIO.save("test.jld2",Dict("data" => df))

	function dosave()
		d = Dict()
		d[:jl] = DataFrame(a = 1:3,b = rand(3))
		d[:r]  = FileIO.load("rr.RData")
		FileIO.save("test.jld2", Dict("data" => d))
	end

	doload() = FileIO.load("test.jld2")["data"]

	function dowork()
		d = doload()
		# work
		return nothing
	end

end

Now let’s run this:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.1.0 (2019-01-21)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> include("jld2.jl")
Main.jldtest

julia> jldtest.dosave()

julia> jldtest.doload()
┌ Warning: type DataFrames.DataFrame does not exist in workspace; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/KjBIK/src/data.jl:1153
┌ Warning: type DataFrames.Index does not exist in workspace; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/KjBIK/src/data.jl:1153
┌ Warning: type RData.DictoVec{Array{Any,1}} does not exist in workspace; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/KjBIK/src/data.jl:1153
Dict{Any,Any} with 2 entries:
  :jl => ##DataFrames.DataFrame#364(AbstractArray{T,1} where T[[1, 2, 3], [0.54321, 0.673996, 0.666545]], ##DataFrames.Index#363(Dict(:a=>1,:b=>2), Symbol[:a…
  :r  => Dict{String,Any}("rdata"=>##RData.DictoVec{Array{Any,1}}#365(Any[[0.895732, 0.391799, 0.953225], ["a", "b", "c"]], Dict("b"=>2,"a"=>1), Dict(2=>"b",…

  • I think I understand the warning: because the DataFrames module is not loaded in the global module (Main), Julia has to reconstruct the type.
  • However, why is that the case even if I don’t return the data into the global workspace at all?
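
To make the scoping in the first point concrete, here is a minimal sketch. It uses the Dates standard library as a stand-in for DataFrames, and the module name Sandbox is made up for illustration. That JLD2 resolves stored type names starting from Main is only my assumption, based on the wording of the warning; I have not confirmed it in the JLD2 source.

```julia
# Hypothetical stand-in for jldtest: the package is loaded inside the module only.
module Sandbox
    using Dates                          # binds the name `Dates` inside Sandbox
    today_in_sandbox() = Dates.today()   # works, because Dates is visible here
end

# The package name is bound inside Sandbox ...
println(isdefined(Sandbox, :Dates))

# ... but it is only bound in Main if Main itself ran `using Dates`.
println(isdefined(Main, :Dates))
```

If the type lookup really does start from Main, then it would not matter where the loaded data ends up, only which packages Main itself has loaded.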
julia> jldtest.dowork()
┌ Warning: type DataFrames.DataFrame does not exist in workspace; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/KjBIK/src/data.jl:1153
┌ Warning: type DataFrames.Index does not exist in workspace; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/KjBIK/src/data.jl:1153
┌ Warning: type RData.DictoVec{Array{Any,1}} does not exist in workspace; reconstructing
└ @ JLD2 ~/.julia/packages/JLD2/KjBIK/src/data.jl:1153

  • Even though I say using DataFrames within my module, I get a warning that the datatype DataFrame needs to be reconstructed.
  • I have an application that loads an R nested list as an RData.DictoVec{Array{Any,1}} into a module where I said using RData. The app crashes because it cannot reconstruct the type (as it can with the DataFrame). I’m unsure why it crashes (the example above doesn’t crash).
  • My workaround is to say using RData first, then load my module, then execute the function that loads the data. My script thus looks like this:
using RData
using MyModule
MyModule.run()
  • I want to get rid of using RData on the first line. Is there any solution?

But aren’t you loading RData into the current module (probably Main here), rather than MyModule?

That’s my current solution, yes, but I don’t like it. The last snippet fails unless I say using RData in Main.

I meant “JLD2-loading” into Main.
Without using RData in Main as well, is there enough info?

Oh I see, your question is why can it reconstruct DataFrames but not RData?
