OK, having used JLD2 and JLD for a bit now, I've actually gone back to just reading in the raw text and using that.
First, I had some issues with JLD2 where my data would sometimes become corrupted or couldn't be read back. Maybe I did something wrong, maybe it's a bug, who knows. When I switched to JLD everything worked, so I used that format for a while. However, I like making changes to my code, and that's a problem because the way data is saved in the JLD format is intimately linked to implementation details. For example, if you rename a type in your code, you can no longer read old data saved under the old name, so you're forced to load and re-save everything with the new implementation. This gets tedious fast.
This issue could be solved by having some default representation: you convert from it when loading data and to it when saving. The problem is that I'd have to commit to some default representation (which I know I'll be tempted to change to get that x% performance improvement), and I don't really want to spend my time benchmarking the three or four different ways I can think of to store and load ragged arrays. Also, if performance matters, converting all your existing data is going to cost some time. Another issue is the likelihood of introducing errors: while I do have unit tests and try to be as thorough as I can, every time you load and save a file there's some chance you mess up a file path, accidentally overwrite a file, etc.
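To make the "default representation" idea concrete, here's a minimal sketch of one of those three or four ways to store ragged arrays: flatten everything into one values vector plus an offsets vector. The names `to_disk`/`from_disk` and the layout are my own assumptions for illustration, not anything JLD provides.

```julia
# Hypothetical conversion layer: a stable on-disk representation for a
# ragged array (vector of vectors), independent of the in-memory types.

# Flatten rows into (values, offsets); row i lives at
# values[offsets[i] : offsets[i+1]-1].
function to_disk(rows::Vector{Vector{Float64}})
    values  = reduce(vcat, rows; init=Float64[])
    offsets = cumsum([1; length.(rows)])
    return values, offsets
end

# Rebuild the ragged array from the two flat vectors.
function from_disk(values::Vector{Float64}, offsets::Vector{Int})
    return [values[offsets[i]:offsets[i+1]-1] for i in 1:length(offsets)-1]
end

rows = [[1.0, 2.0], Float64[], [3.0]]
vals, offs = to_disk(rows)
@assert from_disk(vals, offs) == rows  # round-trips, empty rows included
```

The nice part is that the saved file only ever contains two plain vectors, so renaming types in the code doesn't touch the on-disk format.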
So, at this point in the thought process, I just decided to keep using the existing CSV-like file format. I've optimised my parsing a bit more, so it's a good bit faster now, and reading files is only a bottleneck for the fastest of analyses anyhow; the simulations I'm doing take orders of magnitude more time per file than reading the input files, for example. I still use JLD to save results, though I may change that too. Maybe I'll use pure HDF5 so I'm not so bound to implementation details.
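For the record, here's a hedged sketch of what the "pure HDF5" option could look like using HDF5.jl: results written as plain named datasets, so the file contains only numeric arrays and no Julia-side type information. The file and dataset names are made up for illustration.

```julia
# Sketch: saving results as plain HDF5 datasets with HDF5.jl, so the file
# stays readable no matter how the Julia code is refactored.
using HDF5

vals = [1.0, 2.0, 3.0]   # illustrative result data
offs = [1, 3, 4]

h5open("results.h5", "w") do f
    f["values"]  = vals   # stored as a plain Float64 dataset
    f["offsets"] = offs   # stored as a plain Int dataset
end

# Any HDF5 reader (Julia, Python, C, ...) can fetch the data back by name.
@assert h5read("results.h5", "values") == vals
```

Since the datasets carry no type tags beyond the raw element type, renaming things in the code costs nothing on the reading side.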