Does anyone know the ‘correct’ way to write a scalar variable (especially a String) using parallel HDF5? I think I have a way of doing it, but it’s so ugly I’m hoping there’s a better way!
The problem: suppose I have a String on one MPI rank that I want to write to an HDF5 file that has been opened for parallel I/O. This is especially tricky because only that rank knows the actual length of the string.
I’ve looked for HDF5 documentation on how they recommend doing something like this, but haven’t managed to find anything relevant - the documentation I’ve found for parallel HDF5 doesn’t go very far (refs - HDF5: A Brief Introduction to Parallel HDF5 and Parallel HDF5 Questions).
My solution, reduced to an MWE, is:
parallel-hdf5-test-script.jl:
using HDF5, MPI

function main()
    MPI.Init()
    my_rank = MPI.Comm_rank(MPI.COMM_WORLD)
    output_file = h5open("test.h5", "cw", MPI.COMM_WORLD)

    # Generate a stupid String as an example
    s = "x" ^ rand(1:20)

    # Broadcast the string length from the process we want to write from
    string_size = Ref(length(s))
    MPI.Bcast!(string_size, MPI.COMM_WORLD; root=0)
    if my_rank != 0
        s = " " ^ string_size[]
    end

    # This needs to be called on all processes, with a String of the right length
    # The 'datatype' `var_hdf5_type` contains the length of `s`, which allows
    # the following `write_dataset()` call to work.
    io_var, var_hdf5_type = create_dataset(output_file, "foo", s)
    if my_rank == 0
        # Only need/want to write from a single rank
        write_dataset(io_var, var_hdf5_type, s)
    end

    close(output_file)
end

main()
To run, assuming you have set up HDF5.jl with MPI support:
$ mpirun -np 2 julia parallel-hdf5-test-script.jl
The solution is not so awful, but it wasn’t at all obvious to me to start with (it must have been about the 5th or 6th thing I tried). So even if there isn’t a better way, I’m making this post so I can at least find this solution again!
I’ve always found string handling in HDF5 confusing, so an extra, more specific question: Is the Bcast!() of the string length absolutely necessary? My current guess/understanding is that it is, because the length is part of the ‘string type’ that HDF5 uses to write it, and so has to be included when the variable is created (and variable creation is a collective process that has to be done on all ranks at once).
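A side note on which 'length' gets broadcast: if the fixed-length string type is sized in bytes (which is my understanding of how HDF5.jl builds it, but treat that as an assumption), then length(s) and sizeof(s) only agree for ASCII strings like the "x" ^ n in the MWE. The sketch below is plain Julia with no HDF5 or MPI calls, and just shows the difference. The placeholder helper is a hypothetical name used for illustration, not part of either package.

```julia
# Hypothetical helper (not part of HDF5.jl or MPI.jl): build a dummy string
# that occupies `nbytes` bytes, for ranks that do not hold the real string.
placeholder(nbytes::Integer) = " " ^ nbytes

s_ascii = "x" ^ 7
s_utf8 = "héllo"   # 'é' takes 2 bytes in UTF-8

# For ASCII strings the character count and the byte count agree...
@assert length(s_ascii) == sizeof(s_ascii) == 7

# ...but they differ as soon as a multi-byte character appears.
@assert length(s_utf8) == 5
@assert sizeof(s_utf8) == 6

# Sizing the placeholder by bytes keeps it the same size as the real string.
@assert sizeof(placeholder(sizeof(s_utf8))) == sizeof(s_utf8)
```

If that assumption holds, broadcasting sizeof(s) instead of length(s) would be the safer choice when the string might contain non-ASCII characters.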