I need to store a moderate-to-large matrix that (essentially) represents a least-squares system, because the matrix entries are much more expensive to evaluate than the system is to solve. I would therefore like to store this system on disk and reload it as needed. What would be a convenient and performant file format or Julia package for this, given the following requirements?
- The system can be subdivided into variable-size blocks, with a total number of rows and columns in the range of thousands of blocks.
- Ideally, I’d like the flexibility to store each block as a `Dict` or equivalent (with its entries representing different “kinds” of data).
- I need the ability to load arbitrary sub-systems, as both row and column slices (representing different subsets of the data or of the parameters).
- I need to be able to add rows (data) and columns (parameters) to the system.
- I am unable to hold the entire system in memory.
I’ve had bad experiences with both JLD and JLD2, which were often unable to load files I had created. In addition, JLD (a little more reliable than JLD2 for me) seemed slow. I was planning to try HDF5 next. I understand this is what JLD is based on, but I was hoping that restricting myself to the basic HDF5 data types would help with performance and robustness. Maybe I have just been using JLD2 incorrectly and should revisit it, but JLD2 doesn’t seem to support loading slices? Or should I be looking at databases? (I have zero experience with databases.)
However, before I try many different approaches, I would appreciate any advice that people on this forum have.
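
For concreteness, here is roughly the access pattern I have in mind, sketched with HDF5.jl. This is only a minimal sketch that I have not benchmarked: the file name, the dataset name `"A"`, the sizes, and the chunk shape are all made up, and a single dense, chunked dataset stands in for the block structure (the per-block `Dict` requirement is not covered here).

```julia
using HDF5

# Hypothetical file location, only for illustration
fname = joinpath(mktempdir(), "lsq_system.h5")

# Create a chunked Float64 dataset that can grow in both dimensions
# (a max dimension of -1 means "unlimited").
h5open(fname, "w") do f
    A = create_dataset(f, "A", Float64, ((4, 3), (-1, -1)), chunk=(2, 2))
    A[1:4, 1:3] = reshape(collect(1.0:12.0), 4, 3)
end

# Add rows (data) and columns (parameters): grow the extent,
# then write only the new entries.
h5open(fname, "r+") do f
    A = f["A"]
    HDF5.set_extent_dims(A, (6, 5))
    A[5:6, 1:3] = fill(-1.0, 2, 3)   # two new rows
    A[1:6, 4:5] = fill(-2.0, 6, 2)   # two new columns
end

# Load an arbitrary sub-system; only the requested slice is read from disk.
sub = h5open(fname, "r") do f
    f["A"][2:5, 2:4]
end
```

If this is a sensible direction, I assume the chunk shape would need to be tuned to the typical block size, but I don't know how much that matters in practice.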