Storing data in Julia


#1

I am trying to save the outcome of a Monte Carlo simulation in Julia: 100 MC runs with three 100-by-100 matrices, two 100-by-1 arrays and 100 directed graphs. However, the size of the data comes to 9 GB. I currently use save("filename.jld"). Is there a way to compress the data before storing it?


#2

There seems to be a compress=true option. Have you tried it? Also, maybe the graphs can be stored more efficiently by converting them to sparse arrays? Or maybe use something from https://github.com/JuliaGraphs/GraphIO.jl?


#3

Compression could help a lot if you have repeated data, but unless disk space is your only concern, I’d first look into whether you can reduce what you’re storing. Even if the data compresses very efficiently, compressing and decompressing it will probably be slow. And 9 GB puts a lot of pressure on your memory.

Assuming 100x100 node pairs and 100 runs, the amount you’re storing per node pair is 9*2^30/(100*100^2) ≈ 9664 bytes. What is taking up all this space? What do the structures you’re storing look like?
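
For comparison, here is a rough estimate of how much space the matrices and arrays alone should need. This is only a sketch: it assumes every element is a Float64, which the original post does not state, and it ignores file-format overhead.

```julia
# Back-of-the-envelope size of the dense data, assuming Float64 elements
# (an assumption; the element type is not stated in the thread).
runs = 100
bytes_per_element = sizeof(Float64)          # 8 bytes

matrix_elements = 3 * 100 * 100              # three 100-by-100 matrices per run
vector_elements = 2 * 100                    # two 100-by-1 arrays per run

dense_bytes = runs * (matrix_elements + vector_elements) * bytes_per_element
total_bytes = 9 * 2^30                       # the reported 9 GB

println("dense arrays: ", dense_bytes / 2^20, " MiB")
println("share of total: ", 100 * dense_bytes / total_bytes, " %")
```

Under those assumptions the dense arrays come to about 23 MiB, roughly 0.25% of the 9 GB, so nearly all of the space would have to be going somewhere else.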


#4

One thing that’s good to check is which of your matrices, if any, can be sparse. Julia has pretty good sparse arrays in the stdlib. Even if you don’t want to actually use sparse matrices, you could make them sparse just for loading and storing. If you are storing your graphs as matrices, I’m guessing they should probably be sparse. From your description it sounds like your issue is with the graphs somehow.
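
A minimal sketch of the "sparse just for loading and storing" idea, using only the stdlib. The adjacency matrix here is made up (100 nodes, about 2% of possible edges, Float64 entries), and the size of the `Serialization` output is used only as a rough stand-in for what a file format would write; your actual graphs and numbers may look quite different.

```julia
using SparseArrays, Serialization, Random

Random.seed!(1)

n = 100
# Hypothetical adjacency matrix of a directed graph with ~2% of possible edges.
dense_adj = Float64.(rand(n, n) .< 0.02)

sparse_adj = sparse(dense_adj)               # keeps only the nonzero entries

# Size of the serialized form in bytes, as a rough proxy for on-disk size.
function serialized_bytes(x)
    io = IOBuffer()
    serialize(io, x)
    return length(take!(io))
end

println("nonzeros: ", nnz(sparse_adj), " of ", n^2)
println("dense:  ", serialized_bytes(dense_adj), " bytes")
println("sparse: ", serialized_bytes(sparse_adj), " bytes")

# Convert back after loading if the rest of the code expects a dense matrix.
restored = Matrix(sparse_adj)
```

The saving depends entirely on how many entries are zero. For a matrix that is mostly nonzero, the sparse form can end up larger than the dense one, because it also stores the row and column indices.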


#5

(In that case they should also compress extremely well.)