Storing and retrieving multi-dimensional arrays from a file

Let’s say that for some parameter k I have a 3-d array (in my case, an (N, N, M) array), and k can take a range of values.
I want to store the arrays in a file in such a way that I can later retrieve the array for each value of k. I can’t figure out a good way to do this. Can someone please help?

HDF5, with one dataset named repr(k) for each parameter k?
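To make that suggestion concrete, here is a minimal sketch using HDF5.jl. The sizes, parameter values, and file name are made up for illustration, and random data stands in for the real arrays.

```julia
using HDF5

N, M = 4, 3                      # hypothetical array sizes
ks = [0.5, 1.0, 2.0]             # hypothetical parameter values
arrays = Dict(k => k .* rand(N, N, M) for k in ks)

path = joinpath(mktempdir(), "arrays.h5")   # hypothetical file location

# Write one dataset per parameter value, named by repr(k).
h5open(path, "w") do file
    for (k, A) in arrays
        file[repr(k)] = A
    end
end

# Read back only the array that belongs to a single k.
k = 1.0
A = h5open(path, "r") do file
    read(file[repr(k)])
end
```

Using repr(k) as the name means the stored key can be parsed back into the same Float64, so the lookup does not depend on how k happens to be printed.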


First use of JLD2.jl:

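using JLD2
ks = [1.0, π, 100.0]                         # parameter values
f(k) = k * rand(2, 2, 4)                     # stand-in for the real 3-d array
vk = [(k, f(k)) for k in ks]                 # vector of (k, array) pairs
jldsave("k_fk.jld2", true; large_array=vk)   # `true` turns on compression
uk = load("k_fk.jld2")["large_array"]        # uk == vk  # true
ks_loaded, fk = first.(uk), last.(uk)        # split back into parameters and arrays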
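If you would rather retrieve the array for one k without loading the whole vector, a variation is to give each k its own entry in the file. This is only a sketch: the parameter values and file name are made up, and the repr(k) naming is borrowed from the HDF5 suggestion above.

```julia
using JLD2

ks = [1.0, 2.5, 100.0]                               # hypothetical parameter values
arrays = Dict(k => k .* rand(2, 2, 4) for k in ks)
path = joinpath(mktempdir(), "per_k.jld2")           # hypothetical file location

# Store each array under its own key, named by repr(k).
jldopen(path, "w") do file
    for (k, A) in arrays
        file[repr(k)] = A
    end
end

# Fetch the array for a single k by name.
A = load(path, repr(2.5))

# List which parameter values are stored in the file.
stored = jldopen(path, "r") do file
    sort(parse.(Float64, keys(file)))
end
```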

Thank you very much! I found what I was looking for. Both HDF5.jl and JLD2.jl look feasible, but what are the major differences between the two (use cases, advantages, disadvantages)?

My very limited understanding from reading the docs is that JLD2 implements a subset of HDF5 and that, in order to preserve Julia objects, additional type information has to be supplied using “attributes”. JLD2 seems to handle this automatically. The files created are still regular HDF5 files.


JLD2 is a pure-Julia implementation of a subset of the HDF5 format.
The output files are always valid HDF5 files, and simple types such as numbers, strings, and arrays are stored in a way that other HDF5 tools can read.
Other Julia structs also work, but they can only be loaded again in a useful manner with JLD2.
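As a small illustration of the struct case, the sketch below saves and reloads a custom type with JLD2. The Sample type and the file name are hypothetical, invented just for this example.

```julia
using JLD2

# Hypothetical container type, for illustration only.
struct Sample
    k::Float64
    data::Array{Float64,3}
end

path = joinpath(mktempdir(), "sample.jld2")   # hypothetical file location
s = Sample(2.0, rand(2, 2, 4))

# Save the struct under the name "sample", then load it back.
jldsave(path; sample = s)
t = load(path, "sample")
```

Note that the comparison is best done field by field (t.k == s.k, t.data == s.data), since the default == for a struct holding an array compares by identity rather than by contents.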

HDF5.jl wraps the HDF5 C library and offers much more fine-grained control, but it does not support storing Julia structs.
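Two examples of that extra control, as a rough sketch: attaching the parameter value to a dataset as an attribute, and reading only part of a dataset from disk. The dataset name "run1", the value of k, and the file name are assumptions made for illustration.

```julia
using HDF5

path = joinpath(mktempdir(), "control.h5")   # hypothetical file location
k = 0.1 + 0.2                                # hypothetical parameter value
A = k .* rand(3, 3, 5)

h5open(path, "w") do file
    file["run1"] = A
    # Keep the exact parameter value next to the data as an attribute.
    attributes(file["run1"])["k"] = k
end

k_read, slice = h5open(path, "r") do file
    dset = file["run1"]
    # Read the attribute, and only the second (N, N) layer of the array.
    read(attributes(dset)["k"]), dset[:, :, 2:2]
end
```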
