Loading data sometimes very slow on HPC system

fgerick · February 22, 2024, 9:16am

Hi there,
I am processing a bunch of JLD2 files (O(10GB)) on my local cluster. Sometimes (not always), loading a file takes ages (>10min), but sometimes it only takes ~10 seconds. Usually after loading it slowly once, the second time loads fast. I don’t think the issue is JLD2 only. Has anyone experienced something similar? Is it related to the hardware infrastructure or am I doing something wrong?

df -T says the filesystem is gpfs. I didn’t find anything here or on Google that describes such behaviour.

Clearly, I am not an expert. Thanks for any help!

barucden · February 22, 2024, 9:23am

Could you please clarify the sentence “Usually after loading it slowly once, the second time loads fast”? Do you load one file multiple times?

fgerick · February 22, 2024, 9:25am

Sorry, yes exactly: I just load the exact file again with the motivation of benchmarking.

carstenbauer · February 22, 2024, 9:53am

Hard to say what the issue is here, but parallel file systems can have all kinds of issues.

fgerick · February 22, 2024, 9:57am

that was not the response I was hoping for

jsjie · February 22, 2024, 10:02am

The documentation says:
To exploit disk parallelism when reading a large file from a single-threaded application, whenever it can recognize a pattern, GPFS intelligently prefetches data into its buffer pool, ...
Maybe the reason is that the data are sometimes buffered and sometimes not?

johnh · February 22, 2024, 10:13am

This definitely sounds like the behaviour of buffers.

However, if you are using a hierarchical storage management system, it could be that old files are being pulled back from a slow tier, maybe tape.
Time to bring cookies to the lair of your system admins.

fgerick · February 22, 2024, 10:23am

I really don’t think that tapes are involved here. These are not some old data files produced months/years ago (in fact the cluster didn’t exist a year ago). But maybe the slow read speed is indeed close to what 50-100 MB/s hard drives would give.

barucden · February 22, 2024, 11:01am

It should be easy to test if the infrastructure is to blame, right? For example, you could simply cat each file and maybe count the bytes:

for file in *.jld2
do
    time cat "$file" | wc -c
done

If the time goes up sometimes, the problem is in the infrastructure.
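
Since a repeated read of the same file may be served from a cache, a variant of this check could time each file twice and print both numbers. This is only a sketch: it assumes GNU date (for %N), and the helper names read_seconds and cold_warm are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of any tool mentioned in this thread.

# read_seconds FILE: print the wall-clock seconds needed to read FILE once.
read_seconds() {
    local start end
    start=$(date +%s.%N)          # assumes GNU date, which supports %N
    cat "$1" > /dev/null
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# cold_warm FILE: read FILE twice in a row and print both timings on one line.
cold_warm() {
    local first second
    first=$(read_seconds "$1")
    second=$(read_seconds "$1")
    printf '%s first=%ss second=%ss\n' "$1" "$first" "$second"
}

for file in *.jld2; do
    [ -e "$file" ] || continue    # skip when the glob matches nothing
    cold_warm "$file"
done
```

If the second number is consistently much smaller than the first, some caching layer is probably involved; if both are similar, sequential reads are likely not the bottleneck.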

fgerick · February 22, 2024, 11:14am

Thanks for this check. It seems like the infrastructure is not the issue:

21115272060

real	0m28.891s
user	0m0.075s
sys	0m14.022s
18912694826

real	0m24.032s
user	0m0.048s
sys	0m12.997s
18828809688

real	0m23.701s
user	0m0.052s
sys	0m12.657s
20063077414

real	0m28.950s
user	0m0.069s
sys	0m12.961s
19977486641

real	0m27.586s
user	0m0.063s
sys	0m13.536s
20651053712

real	0m27.257s
user	0m0.072s
sys	0m13.589s

So could the data structure be the problem? Is having a bunch (<10) of dataframes stored in a namedtuple a bad idea?

barucden · February 22, 2024, 11:40am

I am not sure. Long tuples are generally discouraged, but <10 is not long. Do all the named tuples share the field names? I could see it being a problem if the field names change (calling a function with a tuple with differently named fields triggers compilation each time), but my understanding is only superficial.

fgerick · February 22, 2024, 11:50am

It’s just one namedtuple per file, and in each file the namedtuple has the same field names. Here’s an example of the namedtuple that is saved in one .jld2 file:

@NamedTuple{esol::DataFrames.DataFrame, is1::Vector{Int64}, is2::Vector{Int64}, 
modes_observe::DataFrames.DataFrame, corrs_observe::Vector{Float64}, 
br_rms_observe::Vector{Vector{Float64}}, br_rms_ϕ_observe::Vector{Vector{Float64}}, 
uϕ_rms_observe::Vector{Vector{Float64}}, modes_geomag::DataFrames.DataFrame, 
corrs_geomag::Vector{Float64}, modes_u::DataFrames.DataFrame, corrs_u::Vector{Float64}}

The dataframes are all the same structure:

typeof.(eachcol(filtered_modes.esol))

4-element Vector{DataType}:
 Vector{ComplexF64} (alias for Array{Complex{Float64}, 1})
 Vector{Float64} (alias for Array{Float64, 1})
 Vector{Float64} (alias for Array{Float64, 1})
 Vector{Vector{ComplexF64}} (alias for Array{Array{Complex{Float64}, 1}, 1})

I do not see an issue with such a data structure. Nothing is particularly obscure or type unstable.

barucden · February 22, 2024, 11:57am

Me neither! Do you think the issue "Slow performance on Tuple{Type1, Type2}" (Issue #2 · JuliaIO/JLD2.jl · GitHub) could be related? If so, then it was addressed a month ago in "Add @nospecializeinfer around worst offenders" by JonasIsensee (Pull Request #527 · JuliaIO/JLD2.jl · GitHub).

fgerick · February 22, 2024, 1:38pm

That looks related, but I updated to the latest JLD2 version and the issue remains. When I load a file that has not been loaded before, it takes ages. Is the filesystem check with cat really comparable here? If I use two different julia sessions in one job (i.e. in one environment on one node), after loading the file once, the other session also loads the file fast the first time. If I start another slurm job on another node, the first load of that same file is slow. It seems like it really has to do with the distributed file system, but somehow cat does not capture this?

abraemer · February 22, 2024, 1:48pm

You are likely running cat on the login node or something? Maybe the issue only arises when the compute node accesses the memory for the first time?

fgerick · February 22, 2024, 1:55pm

Hmm, no, that does not seem to be the case. cat gives the same timing from anywhere, on files that have not been accessed before. I can cat a file in 20 seconds, but if JLD2 hasn’t loaded it before, JLD2 takes >400 seconds to load it. The second time (even if restarting the julia session), JLD2 loads it in 20 seconds.
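
One thing cat cannot show is how the file system behaves for non-sequential reads, since cat only reads front to back. The sketch below reads small blocks at scattered offsets with dd. Whether JLD2’s access pattern actually looks like this is an assumption, not something established in this thread, and scattered_read is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: scattered_read FILE COUNT
# Reads COUNT blocks of 4 KiB at pseudo-random block offsets in FILE and
# prints the total number of bytes read. The 4 KiB block size is arbitrary.
scattered_read() {
    local file=$1 count=$2 size blocks
    size=$(wc -c < "$file")            # file size in bytes
    blocks=$(( size / 4096 ))          # number of complete 4 KiB blocks
    [ "$blocks" -gt 0 ] || blocks=1
    # Generate COUNT block indices in [0, blocks) with a fixed seed,
    # then read one block at each index.
    awk -v n="$count" -v b="$blocks" \
        'BEGIN { srand(1); for (i = 0; i < n; i++) print int(rand() * b) }' |
    while read -r blk; do
        dd if="$file" bs=4096 skip="$blk" count=1 2>/dev/null
    done | wc -c
}
```

For example, running `time scattered_read somefile.jld2 10000` (file name is a placeholder) on a file that has not been touched yet, and comparing with `time cat` on another untouched file, would indicate whether scattered reads are disproportionately slow on this file system.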

djholiver · February 22, 2024, 1:56pm

When you say “local”, do you mean something literally on your laptop (hosted on a docker k8s instance or swarm), or do you mean the one you use in your network?

If the latter, have you asked your admins whether there is any auto shutdown/restart or shared services / queues / activities taking place?

Regards,

fgerick · February 22, 2024, 2:03pm

“local” means a university HPC cluster that is not something like azure, aws etc. It’s SLURM managed and has a few hundred nodes with lots of storage (kTB). I have not asked the admins about data-access issues as of now. I’m just at the stage of trying to narrow down what is happening.

juliohm · February 22, 2024, 2:23pm

I’ve had a similar experience in the past, just sharing here. I didn’t have the chance to debug it in detail since it is a random issue, and I don’t have the computer science background to inspect file systems.

tbeason · February 22, 2024, 3:00pm

I think you should try to figure out if this is related to JLD2 or not. The easy way to do this is to save a copy of the data in another format and repeat your reading experiments.
