How to store these variables in files and reload them?

Darione · March 6, 2020, 11:34am

Hello, I have these two kinds of variables, x and y, that I want to save to a file and reload later.

julia> x
Dict{Any,Any} with 2 entries:
  "day"        => [1, 2, 3]
  "beautifull" => [7, 5, 10, 11, 23, 56]
  
  
julia> y = [["WORD1","WORD2","WORD3"],[43,54,16]]
2-element Array{Array{T,1} where T,1}:
 ["WORD1", "WORD2", "WORD3"]
 [43, 54, 16]

What is the best way to store them? I have done it using the LDL2.jl package, but perhaps there are other ways, using another package or something already in Julia Base?

Is it better for performance if I declare the types of these variables, or does it not matter, in terms of speed, for later processing of these variables?

Thanks for the help.
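
On the typing question: Julia containers with concrete element types generally let the compiler produce faster code than `Dict{Any,Any}`, so declaring the types at construction is usually worth it. A minimal sketch using the data shown above; the names `x_any`, `x_typed`, `y_typed`, `words`, `counts` and `total_positions` are made up for illustration:

```julia
# Untyped container, as in the output above: keys and values are stored as Any
x_any = Dict{Any,Any}("day" => [1, 2, 3], "beautifull" => [7, 5, 10, 11, 23, 56])

# Concretely typed container: keys are String, values are Vector{Int}
x_typed = Dict{String,Vector{Int}}("day" => [1, 2, 3], "beautifull" => [7, 5, 10, 11, 23, 56])

# y mixes a vector of strings and a vector of integers, so a NamedTuple
# (hypothetical field names) keeps both parts concretely typed
y_typed = (words = ["WORD1", "WORD2", "WORD3"], counts = [43, 54, 16])

# Works on both dictionaries; with x_typed the compiler knows every value is a Vector{Int}
total_positions(d) = sum(length, values(d))
```

Both dictionaries hold the same data; only the declared element types differ. How much speed this buys depends on the later processing, so it is worth measuring on the real workload.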

roble · March 6, 2020, 12:10pm

I recently had a look at it.
One simple way would be by using Serialization.

using Serialization
serialize("filename", x)
loaded = deserialize("filename")

However, compatibility across different Julia versions is not guaranteed.
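
A quick round-trip check with values shaped like the `x` and `y` from the first post might look like the sketch below. The file names and the temporary directory are assumptions for illustration only:

```julia
using Serialization

# Data shaped like the x and y from the first post
x = Dict{Any,Any}("day" => [1, 2, 3], "beautifull" => [7, 5, 10, 11, 23, 56])
y = [["WORD1", "WORD2", "WORD3"], [43, 54, 16]]

# Write each variable to its own file in a temporary directory (hypothetical file names)
dir = mktempdir()
serialize(joinpath(dir, "x.dat"), x)
serialize(joinpath(dir, "y.dat"), y)

# Read them back
x_loaded = deserialize(joinpath(dir, "x.dat"))
y_loaded = deserialize(joinpath(dir, "y.dat"))

# The loaded values should compare equal to the originals and keep their types
same_values = x_loaded == x && y_loaded == y
same_types = typeof(x_loaded) == typeof(x) && typeof(y_loaded) == typeof(y)
```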

Darione · March 6, 2020, 3:06pm

Mmm, I ran some tests. I was saving and loading 3 variables (a 1.2 MB dict, a 1.7 MB array and a small 1.5 KB text). The load times for the stored files, which is what I am interested in, are about the same.

# with JLD2, 3 variables in a single file
06/03/2020  15:49        17.481.362 julia-data-mv.ev.01.txt.jld2

# with serialize, 3 separate files
06/03/2020  15:46             1.555 julia-data-mv.ev.01.txt.parts.dat
06/03/2020  15:46         1.205.052 julia-data-mv.ev.01.txt.wap.dat
06/03/2020  15:46         1.688.390 julia-data-mv.ev.01.txt.was.dat

The big difference is really in the size of the stored files.

With serialization, almost 3MByte.

With the LDL2 package, 17MByte (!!!) … but from what I have read, LDL2 has more features … and storing all the variables in a single file is not so bad, to be honest, whereas with serialization it seems to me that you load and save one variable at a time.

OK, so if for now I am only interested in fast loading of data stored in a few variables, I can use serialization without needing the LDL2 package.
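
Serialization is not limited to one variable per file: any Julia value can be serialized, including a tuple or NamedTuple that bundles several variables. A minimal sketch, assuming three variables named after the files in the listing above (`wap`, `was`, `parts`); their contents and the file name are invented for illustration:

```julia
using Serialization

# Stand-ins for the three variables; the contents are made up
wap = Dict("day" => [1, 2, 3])
was = [["WORD1", "WORD2"], [43, 54]]
parts = "some short text"

# Bundle everything into one NamedTuple and write a single file (hypothetical name)
path = joinpath(mktempdir(), "index.dat")
serialize(path, (wap = wap, was = was, parts = parts))

# One read brings all three back; fields are accessed by name
bundle = deserialize(path)
```

After loading, `bundle.wap`, `bundle.was` and `bundle.parts` hold the three values.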

Any other advice? Is my reasoning sound?
Thanks.

pdeffebach · March 6, 2020, 3:10pm

Could you clarify what problem you are trying to solve? Could you help us understand why you want to save the variables to disk so frequently?

Darione · March 6, 2020, 3:25pm

Oh, I am building a web app for text searching, in Julia and Genie. I wanted the fastest possible text processing, so I avoided Pascal, PHP and Python, and chose Julia.

Not searching in the Google sense, but contextual search: I have, let’s say, 10 books (10 UTF-8 text files), and I have to find every place in these texts where certain words occur close to each other in the same context. To do this, I have to create an “index” of each text in some way, and then I search that index. The first time I search a given file, I create its index and run the search. Later searches on the same file will be faster because I have already saved the index, so there is no need to rebuild it on every search.

OK, so the problem is: create the index of a text file only once (so I don’t mind if it takes time), and load that index every time I search that file, instead of loading the original file and recomputing the index. Saving happens once, but loading is frequent, and I want it to be as quick as possible.
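
The build-once, load-often pattern described above could be sketched as follows. `build_index`, `load_or_build_index`, the `.index.dat` suffix and the word-position index layout are all hypothetical stand-ins, not the actual index from the app:

```julia
using Serialization

# Hypothetical index builder: maps each lowercased word to the positions where it occurs
function build_index(text::AbstractString)
    index = Dict{String,Vector{Int}}()
    for (pos, word) in enumerate(split(lowercase(text)))
        push!(get!(index, String(word), Int[]), pos)
    end
    return index
end

# Load the saved index if present; otherwise (or if loading fails) build it and save it
function load_or_build_index(text_path::AbstractString)
    index_path = text_path * ".index.dat"
    if isfile(index_path)
        try
            return deserialize(index_path)
        catch err
            @warn "Could not load saved index, rebuilding" exception = err
        end
    end
    index = build_index(read(text_path, String))
    serialize(index_path, index)
    return index
end

# Usage with a small temporary text file
book = joinpath(mktempdir(), "book.txt")
write(book, "a day is a beautiful day")
idx = load_or_build_index(book)    # first call builds the index and saves it
idx2 = load_or_build_index(book)   # second call loads it from disk
```

Falling back to a rebuild when loading fails also covers the case where a saved file cannot be read after a Julia upgrade. This sketch does not detect a text file that changed after its index was saved; comparing modification times would be one way to handle that.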

rdeits · March 6, 2020, 3:25pm

Just to clarify, do you mean JLD2 instead of LDL2?

Darione · March 6, 2020, 3:27pm

Oh yes, yes … JLD2, sorry!

hendri54 · March 6, 2020, 4:55pm

My experience with JLD2.jl and BSON.jl: sometimes it works and sometimes it doesn’t. I.e., I get various errors when trying to load the files.

I see no patterns in when the errors are thrown, other than that I am rarely able to load large files (large being only about 30MB).

This even holds when the files only store “built-in” Julia objects (Dict{Symbol, Matrix{Float64}} for example). I have given up trying to load user defined types.

Perhaps I am missing some alternatives that work, but from my experience and from the discussions here on Discourse about the various packages that load and save binary data, I have come to the conclusion that I should avoid binary formats unless I am prepared to lose my data.

I really, really hope that this conclusion is wrong. If not, this strikes me as a major problem for the Julia language.
