HDF5 build failure on HPC

Hello everyone,

Could you please take a minute to look at these error messages? I am trying to install MLDatasets on an HPC cluster, but I get the following error while building and using HDF5. I know that a library is missing, but I cannot fix it, and I don't even know how to ask the administrator for help. Any suggestions?

Thank you very much …

Program file to install packages:

# Use the TUNA mirror as the package server
ENV["JULIA_PKG_SERVER"] = "https://mirrors.tuna.tsinghua.edu.cn/julia"
using Pkg
Pkg.build("HDF5")  # error here
using HDF5  # error here
using SpecialFunctions
using Zygote
using JLD2
using PyCall
using CUDA
using Flux
using BSON
using MLDatasets
using Lathe

Output file:

--------------- PROGRAM OUTPUT ----------------
┌ Warning: The call to compilecache failed to create a usable precompiled cache file for LibSSH2_jll [29816b5a-b9ab-546f-933c-edad1886dfa8]
│   exception = Required dependency MbedTLS_jll [c8ffd9c3-330d-5841-b78e-0817d7145fa1] failed to load from a cache file.
└ @ Base loading.jl:1042
┌ Warning: The call to compilecache failed to create a usable precompiled cache file for LibCURL_jll [deac9b47-8bc7-5906-a0fe-35ac56dc84c0]
│   exception = Required dependency MbedTLS_jll [c8ffd9c3-330d-5841-b78e-0817d7145fa1] failed to load from a cache file.
└ @ Base loading.jl:1042
┌ Warning: The call to compilecache failed to create a usable precompiled cache file for LibSSH2_jll [29816b5a-b9ab-546f-933c-edad1886dfa8]
│   exception = Required dependency MbedTLS_jll [c8ffd9c3-330d-5841-b78e-0817d7145fa1] failed to load from a cache file.
└ @ Base loading.jl:1042
┌ Warning: The call to compilecache failed to create a usable precompiled cache file for LibSSH2_jll [29816b5a-b9ab-546f-933c-edad1886dfa8]
│   exception = Required dependency MbedTLS_jll [c8ffd9c3-330d-5841-b78e-0817d7145fa1] failed to load from a cache file.
└ @ Base loading.jl:1042
┌ Warning: The call to compilecache failed to create a usable precompiled cache file for HDF5_jll [0234f1f7-429e-5d53-9886-15a909be8d59]
│   exception = Required dependency MbedTLS_jll [c8ffd9c3-330d-5841-b78e-0817d7145fa1] failed to load from a cache file.
└ @ Base loading.jl:1042
┌ Warning: The call to compilecache failed to create a usable precompiled cache file for LibCURL_jll [deac9b47-8bc7-5906-a0fe-35ac56dc84c0]
│   exception = Required dependency MbedTLS_jll [c8ffd9c3-330d-5841-b78e-0817d7145fa1] failed to load from a cache file.
└ @ Base loading.jl:1042
┌ Warning: The call to compilecache failed to create a usable precompiled cache file for LibSSH2_jll [29816b5a-b9ab-546f-933c-edad1886dfa8]
│   exception = Required dependency MbedTLS_jll [c8ffd9c3-330d-5841-b78e-0817d7145fa1] failed to load from a cache file.
└ @ Base loading.jl:1042
ERROR: LoadError: LoadError: could not load library "libhdf5-00e8fae8.so.200.0.0"
libhdf5-00e8fae8.so.200.0.0: cannot open shared object file: No such file or directory
Stacktrace:
 [1] dlopen(::String, ::UInt32; throw_error::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109
 [2] dlopen at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109 [inlined] (repeats 2 times)
 [3] top-level scope at /csns_workspace/CSG/lianyl/.julia/packages/HDF5/iH4LA/src/api_types.jl:119
 [4] include(::Function, ::Module, ::String) at ./Base.jl:380
 [5] include at ./Base.jl:368 [inlined]
 [6] include(::String) at /csns_workspace/CSG/lianyl/.julia/packages/HDF5/iH4LA/src/HDF5.jl:1
 [7] top-level scope at /csns_workspace/CSG/lianyl/.julia/packages/HDF5/iH4LA/src/HDF5.jl:51
 [8] include(::Function, ::Module, ::String) at ./Base.jl:380
 [9] include(::Module, ::String) at ./Base.jl:368
 [10] top-level scope at none:2
 [11] eval at ./boot.jl:331 [inlined]
 [12] eval(::Expr) at ./client.jl:467
 [13] top-level scope at ./none:3
in expression starting at /csns_workspace/CSG/lianyl/.julia/packages/HDF5/iH4LA/src/api_types.jl:119
in expression starting at /csns_workspace/CSG/lianyl/.julia/packages/HDF5/iH4LA/src/HDF5.jl:51
   Building HDF5 → `~/.julia/packages/HDF5/iH4LA/deps/build.log`
ERROR: LoadError: Failed to precompile HDF5 [f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f] to /csns_workspace/CSG/lianyl/.julia/compiled/v1.5/HDF5/L7Dga_3RSiF.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1305
 [3] _require(::Base.PkgId) at ./loading.jl:1030
 [4] require(::Base.PkgId) at ./loading.jl:928
 [5] require(::Module, ::Symbol) at ./loading.jl:923
 [6] include(::Function, ::Module, ::String) at ./Base.jl:380
 [7] include(::Module, ::String) at ./Base.jl:368
 [8] exec_options(::Base.JLOptions) at ./client.jl:296
 [9] _start() at ./client.jl:506
in expression starting at /csns_workspace/CSG/lianyl/test_CUDA.jl:17
--------------- PROGRAM COMPLETED--------------

Strangely enough, when I build HDF5 (and then MLDatasets) from the login node, no error occurs.

In most (not all) HPC cluster architectures, the compute nodes are on a private network.
Normally they reach the Internet through NAT (Network Address Translation) on the cluster head node, although other architectures exist.
Cluster head nodes may also have to contact license servers via NAT.
It may be that on your cluster the compute nodes cannot access servers outside the cluster.
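
One way to test this guess is to run a small connectivity check once on the login node and once inside a batch job on a compute node. The following is only a sketch: `can_reach` is a hypothetical helper name, the host is the package server from the script above, and port 443 is assumed because the server URL is HTTPS. A `false` on the compute node would suggest that it cannot reach outside servers, but it would not prove that this is the cause of the HDF5 error.

```julia
using Sockets

# Hypothetical helper: try to open a TCP connection to host:port and
# give up after `timeout` seconds. Returns true only if the connection opened.
function can_reach(host::AbstractString, port::Integer; timeout::Float64 = 5.0)
    task = @async begin
        try
            sock = connect(host, port)
            close(sock)
            true
        catch
            # DNS failure, refused connection, unreachable network, ...
            false
        end
    end
    # timedwait returns :ok once the task has finished, :timed_out otherwise
    status = timedwait(() -> istaskdone(task), timeout)
    return status == :ok && fetch(task)
end

# Host taken from JULIA_PKG_SERVER in the script; port 443 is an assumption (HTTPS).
println(gethostname(), " can reach the package server: ",
        can_reach("mirrors.tuna.tsinghua.edu.cn", 443))
```

Comparing the two printed lines (login node versus compute node) gives you something concrete to show the administrators.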

What does the build log contain?
~/.julia/packages/HDF5/iH4LA/deps/build.log
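
If it helps, here is a rough sketch for collecting that information from inside a job on a compute node. It prints the build log at the path above and then lists every file in the Julia depots whose name starts with `libhdf5`, so you can see whether the compute node sees the same files as the login node. `find_files` is a hypothetical helper, and the `libhdf5` prefix is taken from the error message; nothing here is specific to HDF5.jl itself.

```julia
# Print the build log named in the error output, if this node can see it.
logfile = expanduser("~/.julia/packages/HDF5/iH4LA/deps/build.log")
if isfile(logfile)
    print(read(logfile, String))
else
    println("No build log visible at ", logfile)
end

# Hypothetical helper: collect files under `root` whose name starts with `prefix`.
function find_files(root::AbstractString, prefix::AbstractString)
    hits = String[]
    isdir(root) || return hits
    for (dir, _, files) in walkdir(root)
        for f in files
            startswith(f, prefix) && push!(hits, joinpath(dir, f))
        end
    end
    return hits
end

# Walking a large depot can take a while on a network filesystem.
println("Host: ", gethostname())
for depot in DEPOT_PATH
    foreach(println, find_files(depot, "libhdf5"))
end
```

Running the same script on the login node and comparing the two outputs shows whether the library file is missing only on the compute node.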

Please do ask your cluster admins for advice.
I recommend bringing cookies. Throw the cookies between the bars of their cage and they will be like pussy cats.

Meow