Julia libdl_find_library in cluster

Hi, I am trying to run Julia on my university cluster.
Julia is not installed as a module on the cluster, so I run the .jl file by giving the exact location of the Julia executable. Until now my code was single-threaded and I did not face any problems. I am trying to install MKL_jll for my multi-threaded linear-algebra calculations, but it seems that cloning from GitHub is also blocked. So I linked mkl_rt using Libdl instead. It works perfectly fine if I run the code interactively on the master node, but if I submit the job using a PBS script I get an error. I am attaching my test.jl code here.

#!/home/j_tanu/software/julia/bin/julia
using Libdl,LinearAlgebra,DelimitedFiles
using LinearAlgebra:BlasInt
global libmkl_rt = Libdl.find_library(["libmkl_rt"], ["/apps/compilers/intel-2018u3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64"]);
function mkl_dsyevd!(A::StridedMatrix{Float64}, W::Vector{Float64})
    n = BlasInt(LinearAlgebra.checksquare(A))
    work  = Vector{Float64}(undef, 1)
    lwork = BlasInt(-1)
    iwork  = Vector{BlasInt}(undef, 1)
    liwork = BlasInt(-1)
    info  = Ref{BlasInt}()
    jobz='V'
    uplo='L'
    for i = 1:2
        ccall(("dsyevd", libmkl_rt), Cvoid,
            (Ref{UInt8}, Ref{UInt8}, Ref{BlasInt}, Ptr{Cdouble}, Ref{BlasInt},
            Ptr{Cdouble}, Ptr{Cdouble}, Ref{BlasInt}, Ptr{BlasInt}, Ref{BlasInt}, Ptr{BlasInt}),
            jobz, uplo, n, A, max(1,stride(A,2)), W, work, lwork, iwork, liwork, info)
        # chklapackerror(info[])
        if i == 1  # the first pass is a workspace query (lwork = liwork = -1)
            lwork = BlasInt(real(work[1]))
            resize!(work, lwork)
            liwork = BlasInt(real(iwork[1]))
            resize!(iwork, liwork)
        end
    end
end
a=rand(12000,12000);
b=a+a';
c=fill(0.0,12000);
mkl_dsyevd!(b,c);
fd=open("julia_mkl_dsyevd_out.dat","w+");
writedlm(fd,c);
close(fd);

If I run ./test.jl directly, it works perfectly. Now I am submitting the job using this PBS script:

#!/bin/bash
#PBS -N mkl-example
#PBS -l nodes=1:ppn=32
#PBS -o out.log
#PBS -e err.log
#PBS -q old_compute
#PBS -j oe
export JULIA_NUM_THREADS=32
cd $PBS_O_WORKDIR
./test.jl

and I am getting this error:

ERROR: LoadError: could not load symbol "dsyevd":
/home/j_tanu/software/julia/bin/julia: undefined symbol: dsyevd
Stacktrace:
 [1] mkl_dsyevd!(::Array{Float64,2}, ::Array{Float64,1}) at /home/j_tanu/test.jl:15
 [2] top-level scope at ./timing.jl:174 [inlined]
 [3] top-level scope at /home/j_tanu/julia_working_mkl_dsyevd_diag.jl:0
 [4] include(::Function, ::Module, ::String) at ./Base.jl:380
 [5] include(::Module, ::String) at ./Base.jl:368
 [6] exec_options(::Base.JLOptions) at ./client.jl:296
 [7] _start() at ./client.jl:506
in expression starting at /home/j_tanu/test.jl:31

I cannot fully understand what causes the error. Maybe Libdl.find_library cannot find libmkl_rt on the worker node.
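One thing that may explain the message: Libdl.find_library does not throw when it finds nothing, it returns an empty string. As far as I understand, a ccall with an empty library name then looks the symbol up in the running julia process itself, which would match the "undefined symbol" line pointing at the julia binary. A minimal sketch of a guard, where require_library is a hypothetical helper name and not part of Libdl:

```julia
using Libdl

# Hypothetical helper (not part of Libdl): behaves like Libdl.find_library,
# but fails loudly instead of returning "" when nothing could be loaded.
function require_library(names::Vector{String}, dirs::Vector{String})
    lib = Libdl.find_library(names, dirs)
    if isempty(lib)
        # find_library returns "" when dlopen fails for every candidate,
        # e.g. the directory is not mounted or a dependency is missing.
        error("could not load any of $(names) from $(dirs) on host $(gethostname())")
    end
    return lib
end
```

With the path from the post this would read const libmkl_rt = require_library(["libmkl_rt"], ["/apps/compilers/intel-2018u3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64"]), so a batch job on a node that cannot load the library stops with a clear message that names the host.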

You might be able to get it running on a different machine (or a VM) and copy your ~/.julia directory over to the cluster?

global libmkl_rt = Libdl.find_library(["libmkl_rt"], ["/apps/compilers/intel-2018u3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64"]);

should be

const libmkl_rt = Libdl.find_library(["libmkl_rt"], ["/apps/compilers/intel-2018u3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64"]);

Not sure if that would fix it. If not, try running Libdl.dlopen and Libdl.dlsym to try to narrow it down.
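A rough sketch of that narrowing-down, assuming the full path to the .so file is known. diagnose_symbol is a made-up helper name, while dlopen, dlsym and dlclose are the Libdl functions:

```julia
using Libdl

# Hypothetical helper: report at which stage loading a symbol fails.
function diagnose_symbol(libpath::AbstractString, symbol::Symbol)
    # Stage 1: is the file visible from this node at all?
    isfile(libpath) || return :file_missing
    # Stage 2: can the dynamic loader open it (dependencies included)?
    handle = Libdl.dlopen(libpath; throw_error = false)
    handle === nothing && return :dlopen_failed
    # Stage 3: does the library export the symbol?
    ptr = Libdl.dlsym(handle, symbol; throw_error = false)
    Libdl.dlclose(handle)
    return ptr === nothing ? :symbol_missing : :ok
end
```

Calling it with the full path to libmkl_rt.so and :dsyevd, once on the master node and once inside the PBS job, should show whether the two environments differ and at which stage.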

I seem to recall that MKL requires certain environment variables to be set. Does your cluster provide it via a module system (typically a shell command something like module load mkl)?

const libmkl_rt = Libdl.find_library(["libmkl_rt"], ["/apps/compilers/intel-2018u3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64"]);
does not fix the problem.
If I use Libdl.dlopen instead of Libdl.find_library, it does not find the shared library:

const libmkl_rt_path = "/bkup_apps/12oct2020/apps/compilers/intel-2018u3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64/libmkl_rt.so"
libmkl_rt = Libdl.dlopen(libmkl_rt_path)

It throws an error

ERROR: LoadError: could not load library "/bkup_apps/12oct2020/apps/compilers/intel-2018u3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64/libmkl_rt.so"
/bkup_apps/12oct2020/apps/compilers/intel-2018u3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64/libmkl_rt.so: cannot open shared object file: No such file or directory
Stacktrace:
 [1] dlopen(::String, ::UInt32; throw_error::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109
 [2] dlopen at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Libdl/src/Libdl.jl:109 [inlined] (repeats 2 times)
 [3] top-level scope at /home/j_tanu/test.jl:11
 [4] include(::Function, ::Module, ::String) at ./Base.jl:380
 [5] include(::Module, ::String) at ./Base.jl:368
 [6] exec_options(::Base.JLOptions) at ./client.jl:296
 [7] _start() at ./client.jl:506

On the cluster, libmkl_rt.so is available at that specific location. Unfortunately, MKL is not available via a module.

Good morning. If the admins block cloning from GitHub, then maybe ask why? They might have good reasons.
It is a surprise that a research system blocks cloning from GitHub - how do people download packages they are interested in?
We need to find out more about this - do you know how the git clone is blocked?
I ask because there have been issues in the past with Julia being used behind proxies to download from GitHub, and this is related to the libraries which Julia uses.
Please give us some output.
Also, do you know if there is a proxy or any proxy settings?

Please send us a copy and paste of the output when you add the package.

I am attaching the output

 Installing known registries into `~/.julia`
┌ Warning: could not download https://pkg.julialang.org/registries
└ @ Pkg.Types /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:951
┌ Warning: could not download https://pkg.julialang.org/registries
└ @ Pkg.Types /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:951
    Cloning registry from "https://github.com/JuliaRegistries/General.git"
ERROR: LoadError: failed to clone from https://github.com/JuliaRegistries/General.git, error: GitError(Code:ERROR, Class:OS, failed to connect to github.com: Connection timed out)
Stacktrace:
 [1] pkgerror(::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:52
 [2] clone(::Pkg.Types.Context, ::String, ::String; header::String, credentials::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/GitTools.jl:153
 [3] (::Pkg.Types.var"#94#97"{Pkg.Types.Context,String,Pkg.Types.RegistrySpec})(::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:1000
 [4] mktempdir(::Pkg.Types.var"#94#97"{Pkg.Types.Context,String,Pkg.Types.RegistrySpec}, ::String; prefix::String) at ./file.jl:682
 [5] mktempdir at ./file.jl:680 [inlined] (repeats 2 times)
 [6] clone_or_cp_registries(::Pkg.Types.Context, ::Array{Pkg.Types.RegistrySpec,1}, ::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:980
 [7] clone_or_cp_registries at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:970 [inlined] (repeats 2 times)
 [8] clone_default_registries(::Pkg.Types.Context; only_if_empty::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:874
 [9] clone_default_registries at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:862 [inlined]
 [10] find_registered!(::Pkg.Types.Context, ::Array{String,1}, ::Array{Base.UUID,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:1239
 [11] registry_resolve!(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:770
 [12] add(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}; preserve::Pkg.Types.PreserveLevel, platform::Pkg.BinaryPlatforms.Linux, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:176
 [13] add(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:140
 [14] #add#21 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:67 [inlined]
 [15] add at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:67 [inlined]
 [16] #add#20 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:66 [inlined]
 [17] add at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:66 [inlined]
 [18] add(::String; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:65
 [19] add(::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:65
 [20] top-level scope at /home/j_tanu/package_install.jl:3
 [21] include(::Function, ::Module, ::String) at ./Base.jl:380
 [22] include(::Module, ::String) at ./Base.jl:368
 [23] exec_options(::Base.JLOptions) at ./client.jl:296
 [24] _start() at ./client.jl:506
in expression starting at /home/j_tanu/package_install.jl:3
caused by [exception 1]
GitError(Code:ERROR, Class:OS, failed to connect to github.com: Connection timed out)
Stacktrace:
 [1] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LibGit2/src/error.jl:106 [inlined]
 [2] clone(::SubString{String}, ::String, ::LibGit2.CloneOptions) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LibGit2/src/repository.jl:459
 [3] clone(::SubString{String}, ::String; branch::String, isbare::Bool, remote_cb::Ptr{Nothing}, credentials::LibGit2.CachedCredentials, callbacks::Dict{Symbol,Tuple{Ptr{Nothing},Any}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LibGit2/src/LibGit2.jl:580
 [4] clone(::Pkg.Types.Context, ::String, ::String; header::String, credentials::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/GitTools.jl:143
 [5] (::Pkg.Types.var"#94#97"{Pkg.Types.Context,String,Pkg.Types.RegistrySpec})(::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:1000
 [6] mktempdir(::Pkg.Types.var"#94#97"{Pkg.Types.Context,String,Pkg.Types.RegistrySpec}, ::String; prefix::String) at ./file.jl:682
 [7] mktempdir at ./file.jl:680 [inlined] (repeats 2 times)
 [8] clone_or_cp_registries(::Pkg.Types.Context, ::Array{Pkg.Types.RegistrySpec,1}, ::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:980
 [9] clone_or_cp_registries at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:970 [inlined] (repeats 2 times)
 [10] clone_default_registries(::Pkg.Types.Context; only_if_empty::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:874
 [11] clone_default_registries at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:862 [inlined]
 [12] find_registered!(::Pkg.Types.Context, ::Array{String,1}, ::Array{Base.UUID,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:1239
 [13] registry_resolve!(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Types.jl:770
 [14] add(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}; preserve::Pkg.Types.PreserveLevel, platform::Pkg.BinaryPlatforms.Linux, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:176
 [15] add(::Pkg.Types.Context, ::Array{Pkg.Types.PackageSpec,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:140
 [16] #add#21 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:67 [inlined]
 [17] add at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:67 [inlined]
 [18] #add#20 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:66 [inlined]
 [19] add at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:66 [inlined]
 [20] add(::String; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:65
 [21] add(::String) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/API.jl:65
 [22] top-level scope at /home/j_tanu/package_install.jl:3
 [23] include(::Function, ::Module, ::String) at ./Base.jl:380
 [24] include(::Module, ::String) at ./Base.jl:368
 [25] exec_options(::Base.JLOptions) at ./client.jl:296
 [26] _start() at ./client.jl:506

Thank you. It is difficult to figure out why this is blocked.
On the command line can you use wget to download a file from a website?

Wget is working from the command line.

Hmmm… I know I am risking annoying you here.
I guess when you try a git clone from the command line this fails?
Which would indicate a firewall rule which blocks access to github specifically?
Which seems mad to me…

git clone is also working fine

This is getting a bit beyond me.
There was an issue in the past with Julia and the libtls version on Ubuntu.
I do not think this is the problem here.

Are you using a proxy? If you run env | grep -i proxy you can see whether any proxy settings are applied when you log in.
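The same check can be made from inside Julia, which also shows what the julia process itself sees in a batch job. This is only a sketch: proxy_settings is a hypothetical helper name, and unlike the grep it matches variable names only, not values:

```julia
# Hypothetical helper: collect environment variables whose name mentions
# "proxy", case-insensitively. Works on ENV or on any Dict of strings.
function proxy_settings(env = ENV)
    return Dict(k => v for (k, v) in env if occursin("proxy", lowercase(k)))
end
```

Running println(proxy_settings()) on the login node and in the job script makes it easy to compare the two.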

Do you know which OS the cluster uses?

I have some generic advice here. It involves a box of cookies.
Locate the cave where your cluster admins live. Do not be afraid. Courage. The admins wear black and they never smile. They play heavy metal music to keep people like you out of their cave.
Do not be afraid. Offer them cookies.

Depending on your country and culture, beer may also work.

Another bit of advice - you can run an interactive job in PBS.

qsub -I -l nodes=1:ppn=32
You now get a command-line prompt on a compute node. Can you run the Julia program now?
Can you locate the MKL library on this node?
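If it helps, here is a small sketch for checking that from Julia on whichever node you land on. library_visibility is a made-up name; it only reports the host name, whether the directory exists there, and which files in it start with the given prefix:

```julia
# Hypothetical helper: summarise what the current node can see under `dir`.
function library_visibility(dir::AbstractString, prefix::AbstractString)
    exists = isdir(dir)
    # Only list the directory if it is actually visible from this node.
    found = exists ? filter(f -> startswith(f, prefix), readdir(dir)) : String[]
    return (host = gethostname(), dir_exists = exists, matches = found)
end
```

Calling it with the intel64 directory from the first post and the prefix "libmkl_rt", on the master node and then on the compute node, would show directly whether the directory is visible on both.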

Yeah,
now I am able to install the package.

:slight_smile: :+1:

Thank you very much :grin: :grin:
MKL_jll works perfectly fine.
I cannot understand why Libdl.find_library failed on the worker node.