How to save parallel outputs to the same file on an HPC cluster

I am running a Monte Carlo study on an HPC cluster. I parallelize the 100 Monte Carlo replications with a Slurm job array, and I want to save the results of every replication to the same file. Say that for replication ID n I have two results, both of them small Vector{Float64}. I would then like to save them as follows:

n_1  result_1  result_2
n_2  result_1  result_2
n_3  result_1  result_2

I think I am looking for a suitable package for saving data on an HPC cluster, and some code like this:

save!("file", "$SLURM_ARRAY_TASK_ID", "$result_1", "$result_2")

Any suggestions will be appreciated.

One way to do this would be to use MPI.jl with HDF5.jl. You will need to point HDF5.jl at an existing parallel HDF5 installation on your cluster, as parallel HDF5 is not enabled in the HDF5_jll binary. There is a test in the HDF5.jl package that may serve as a reference. There doesn't seem to be much documentation on this, but the documentation of the Python wrapper h5py may also help.
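A minimal sketch of what that could look like, assuming HDF5.jl is built against an MPI-enabled libhdf5 and that the replications run as MPI ranks within a single job rather than as separate array tasks; the file name, dataset names, and vector length below are placeholders:

using MPI, HDF5

MPI.Init()
comm  = MPI.COMM_WORLD
rank  = MPI.Comm_rank(comm)
nproc = MPI.Comm_size(comm)

len = 3                          # assumed fixed length of each result vector
result_1 = rand(len)             # placeholders for the actual replication results
result_2 = rand(len)

# All ranks open the same file collectively; this requires an MPI-enabled libhdf5.
h5open("results.h5", "w", comm) do file
    # Dataset creation is collective: every rank must make these calls with
    # identical arguments, even though each rank writes only its own row.
    d1 = create_dataset(file, "result_1", datatype(Float64), dataspace((nproc, len)))
    d2 = create_dataset(file, "result_2", datatype(Float64), dataspace((nproc, len)))
    row = rank + 1
    d1[row:row, :] = reshape(result_1, 1, len)
    d2[row:row, :] = reshape(result_2, 1, len)
end

MPI.Finalize()

Each rank writes its own row of the shared datasets, so there is no contention on the data itself; the collective open and dataset creation are what require the parallel HDF5 build.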

Practically, I usually just write a bunch of separate data files to cluster scratch, and then run another job to combine them if necessary.
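For example, something like this (a rough sketch; the scratch path, file names, and the use of DelimitedFiles are assumptions, adjust to your setup):

using DelimitedFiles

# Each array task writes its own file, named by SLURM_ARRAY_TASK_ID,
# so no two tasks ever touch the same file.
scratch = get(ENV, "SCRATCH", tempdir())       # adjust to your cluster's scratch path
task_id = parse(Int, ENV["SLURM_ARRAY_TASK_ID"])

result_1 = rand(3)                             # placeholder results
result_2 = rand(3)

writedlm(joinpath(scratch, "replication_$(task_id).txt"),
         permutedims(vcat(task_id, result_1, result_2)))

# Later, a separate single-task job concatenates everything into one file:
files = filter(f -> occursin("replication_", f), readdir(scratch; join = true))
writedlm(joinpath(scratch, "all_results.txt"),
         reduce(vcat, (readdlm(f) for f in files)))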

Whenever I can, I use a database to handle the concurrency; LibPQ.jl and SQLite.jl both work well. I believe a database, even an ephemeral one, will outperform manually handling locks around the I/O.
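A minimal sketch of the SQLite.jl route, assuming the database file lives on a filesystem where SQLite's locking works reliably; the table layout and storing the small vectors as comma-separated text are my own choices here:

using SQLite, DBInterface

db = SQLite.DB("results.sqlite")

# One row per replication.
DBInterface.execute(db, """
    CREATE TABLE IF NOT EXISTS results (
        task_id  INTEGER,
        result_1 TEXT,
        result_2 TEXT
    )
""")

task_id  = parse(Int, ENV["SLURM_ARRAY_TASK_ID"])
result_1 = rand(3)                 # placeholder results
result_2 = rand(3)

DBInterface.execute(db,
    "INSERT INTO results VALUES (?, ?, ?)",
    (task_id, join(result_1, ","), join(result_2, ",")))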


IIRC single-file storage systems like HDF5 and SQLite don't work reliably on NFS, since file locking there is unreliable. How do people handle this on HPC clusters? (Scratch space?) I just use a dumb solution that allocates a different file path to each task, as that is OK for my use case.


Thanks for your reply. So it seems this is a little difficult to do. If I instead save the result of each replication to a unique file, how should I do that? When running locally I use the DelimitedFiles.jl package, but I couldn't find where it saves the results on the HPC cluster.

Actually, you might have a good method there.
Write to local storage on the node, into $TMPDIR, and at the end of the job copy all the output files to their final storage place.
Local storage is usually the fastest.
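A rough sketch of that pattern (the destination directory and file naming are assumptions):

# Write to node-local $TMPDIR during the job, then copy the finished file
# to shared storage once at the end.
tmpdir  = get(ENV, "TMPDIR", tempdir())
dest    = get(ENV, "SLURM_SUBMIT_DIR", pwd())      # final location on shared storage
task_id = get(ENV, "SLURM_ARRAY_TASK_ID", "0")

localfile = joinpath(tmpdir, "replication_$(task_id).txt")
open(localfile, "w") do io
    println(io, "results for task $(task_id)")      # placeholder payload
end

cp(localfile, joinpath(dest, basename(localfile)); force = true)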

But to write to HDF5 format with MPI-IO you need a filesystem that supports it.
You mention scratch space; your scratch storage probably does support it.

As a sanity check, I'd make sure that the working directories of the Julia processes are the intended ones. If not, maybe just using a full path would fix it?
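For example, something along these lines (using the submit directory as the output location is just an assumption):

using DelimitedFiles

# Print the working directory to the job log, then write with an absolute path
# so the output lands where you expect regardless of where Slurm starts Julia.
@info "Working directory" pwd()

task   = ENV["SLURM_ARRAY_TASK_ID"]
outdir = get(ENV, "SLURM_SUBMIT_DIR", pwd())        # assumed output location
writedlm(joinpath(outdir, "result_$(task).txt"), permutedims(rand(3)))   # placeholder result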

Thanks! Yeah, I think I’ll use those techniques if I have an I/O-intensive program.