Occasional precompile error on a cluster using Julia 1.8

I’ve just started using Julia 1.8 on a cluster, and I hit the following error while instantiating (or adding packages to) an environment:

┌ Error: Pkg.precompile error
│   exception =
│    IOError: could not spawn `/scratch/user/julia/julia-1.8/bin/julia -Cnative -J/scratch/user/julia/julia-1.8/lib/julia/sys.so -g1 -O0 --output-ji /scratch/user/.julia/compiled/v1.8/IterTools/jl_CVGiTl --output-incremental=yes --startup-file=no --history-file=no --warn-overwrite=yes --color=yes -`: resource temporarily unavailable (EAGAIN)
│    Stacktrace:
│      [1] _spawn_primitive(file::String, cmd::Cmd, stdio::Vector{Union{RawFD, IO}})
│        @ Base ./process.jl:128
│      [2] #725
│        @ ./process.jl:139 [inlined]
│      [3] setup_stdios(f::Base.var"#725#726"{Cmd}, stdios::Vector{Union{RawFD, IO}})
│        @ Base ./process.jl:223
│      [4] _spawn
│        @ ./process.jl:138 [inlined]
│      [5] _spawn(::Base.CmdRedirect, ::Vector{Union{RawFD, IO}}) (repeats 2 times)
│        @ Base ./process.jl:166
│      [6] open(cmds::Base.CmdRedirect, stdio::Base.TTY; write::Bool, read::Bool)
│        @ Base ./process.jl:397
│      [7] open(cmds::Base.CmdRedirect, mode::String, stdio::Base.TTY)
│        @ Base ./process.jl:366
│      [8] create_expr_cache(pkg::Base.PkgId, input::String, output::String, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, internal_stderr::IO, internal_stdout::IO)
│        @ Base ./loading.jl:1443
│      [9] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, ignore_loaded_modules::Bool)
│        @ Base ./loading.jl:1524
│     [10] (::Pkg.API.var"#245#274"{IOBuffer, String, Base.PkgId})()
│        @ Pkg.API /scratch/user/julia/julia-1.8/share/julia/stdlib/v1.8/Pkg/src/API.jl:1310
│     [11] with_logstate(f::Function, logstate::Any)
│        @ Base.CoreLogging ./logging.jl:511
│     [12] with_logger
│        @ ./logging.jl:623 [inlined]
│     [13] macro expansion
│        @ /scratch/user/julia/julia-1.8/share/julia/stdlib/v1.8/Pkg/src/API.jl:1308 [inlined]
│     [14] (::Pkg.API.var"#242#271"{Bool, Pkg.Types.Context, Vector{Task}, Pkg.API.var"#handle_interrupt#263"{Base.Event, ReentrantLock, Base.TTY}, Pkg.API.var"#color_string#261", Base.Event, Base.Event, ReentrantLock, Vector{Base.PkgId}, Vector{Base.PkgId}, Dict{Base.PkgId, String}, Vector{Base.PkgId}, Vector{Base.PkgId}, Vector{Pkg.Types.PackageSpec}, Dict{Base.PkgId, Bool}, Dict{Base.PkgId, Base.Event}, Dict{Base.PkgId, Bool}, Vector{Base.PkgId}, Bool, Base.TTY, Base.Semaphore, String, Vector{String}, Vector{Base.PkgId}, Base.PkgId})()
│        @ Pkg.API ./task.jl:482
└ @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.8/Pkg/src/API.jl:1183

I’m not sure what this issue is about, and I don’t face it on my laptop. Any help?

I’ve had some luck by following “Limit how many threads used during precompile” (Issue #2404, JuliaLang/Pkg.jl) and setting ENV["JULIA_NUM_PRECOMPILE_TASKS"] to a low number (e.g. 4).
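
For reference, a minimal sketch of applying this from the shell instead of inside Julia, so that the setting is already in place when Julia starts. The value 4 is just the example from above, and the commented `julia` invocation is an assumption about how the environment gets instantiated.

```shell
# Cap the number of parallel precompile jobs that Pkg may start.
# 4 is only an example value; lower it if the spawn errors persist.
export JULIA_NUM_PRECOMPILE_TASKS=4
echo "JULIA_NUM_PRECOMPILE_TASKS=$JULIA_NUM_PRECOMPILE_TASKS"

# Then start Julia from this same shell so it inherits the variable, e.g.:
#   julia --project=. -e 'using Pkg; Pkg.instantiate()'
```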

I’m having the same issue, but setting ENV["JULIA_NUM_PRECOMPILE_TASKS"] to 1 or 2 does not resolve it.

My version info:

Commit 5544a0fab76 (2022-08-17 13:38 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × Intel(R) Xeon(R) Gold 6238 CPU @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, cascadelake)
  Threads: 1 on 32 virtual cores
Environment:
  LD_LIBRARY_PATH = /opt/R/4.0.4/lib/R/lib:/lib:/usr/local/lib:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.302.b08-0.el8_4.x86_64/jre/lib/amd64/server
  LD_GOLD = /data/home/user/miniconda3/bin/x86_64-conda-linux-gnu-ld.gold
  JULIA_NUM_PRECOMPILE_TASKS = 1

My error

└ @ Base loading.jl:1662
OpenBLAS blas_thread_init: pthread_create failed for thread 11 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 12 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 13 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 14 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 15 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 16 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 17 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 18 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 19 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 20 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 21 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 22 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 23 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 24 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 25 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 26 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 27 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 28 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 29 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 30 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 31 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
ERROR: LoadError: IOError: could not spawn `/data/home/user/julia-1.8.0/bin/julia -Cnative -J/data/home/user/julia-1.8.0/lib/julia/sys.so -O0 -g1 --color=yes --startup-file=no -O0 --output-ji /data/home/user/.julia/compiled/v1.8/ChangesOfVariables/jl_9g5Po0 --output-incremental=yes --startup-file=no --history-file=no --warn-overwrite=yes --color=yes -`: resource temporarily unavailable (EAGAIN)
Stacktrace:
  [1] _spawn_primitive(file::String, cmd::Cmd, stdio::Vector{Union{RawFD, IO}})
    @ Base ./process.jl:128
  [2] #724
    @ ./process.jl:139 [inlined]
  [3] setup_stdios(f::Base.var"#724#725"{Cmd}, stdios::Vector{Union{RawFD, IO}})
    @ Base ./process.jl:223
  [4] _spawn
    @ ./process.jl:138 [inlined]
  [5] _spawn(::Base.CmdRedirect, ::Vector{Union{RawFD, IO}}) (repeats 2 times)
    @ Base ./process.jl:166
  [6] open(cmds::Base.CmdRedirect, stdio::IOContext{Base.PipeEndpoint}; write::Bool, read::Bool)
    @ Base ./process.jl:397
  [7] open(cmds::Base.CmdRedirect, mode::String, stdio::IOContext{Base.PipeEndpoint})
    @ Base ./process.jl:366
  [8] create_expr_cache(pkg::Base.PkgId, input::String, output::String, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, internal_stderr::IO, internal_stdout::IO)
    @ Base ./loading.jl:1590
  [9] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1671
 [10] compilecache
    @ ./loading.jl:1649 [inlined]
 [11] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
 [12] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
 [13] macro expansion
    @ ./loading.jl:1180 [inlined]
 [14] macro expansion
    @ ./lock.jl:223 [inlined]
 [15] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
 [16] include
    @ ./Base.jl:419 [inlined]
 [17] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
    @ Base ./loading.jl:1554
 [18] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/LogExpFunctions/mk9H3/src/LogExpFunctions.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile LogExpFunctions [2ab3a3ac-af41-5b50-aa03-7779005ae688] to /data/home/user/.julia/compiled/v1.8/LogExpFunctions/jl_2ZXM4h.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] include
    @ ./Base.jl:419 [inlined]
 [10] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
    @ Base ./loading.jl:1554
 [11] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/SpecialFunctions/hefUc/src/SpecialFunctions.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile SpecialFunctions [276daf66-3868-5448-9aa4-cd146d93841b] to /data/home/user/.julia/compiled/v1.8/SpecialFunctions/jl_dLkW9M.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] include
    @ ./Base.jl:419 [inlined]
 [10] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
    @ Base ./loading.jl:1554
 [11] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/ColorVectorSpace/bhkoO/src/ColorVectorSpace.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile ColorVectorSpace [c3611d14-8923-5661-9e6a-0046d554d3a4] to /data/home/user/.julia/compiled/v1.8/ColorVectorSpace/jl_pFTBqM.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] include
    @ ./Base.jl:419 [inlined]
 [10] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
    @ Base ./loading.jl:1554
 [11] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/ColorSchemes/E9e0B/src/ColorSchemes.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile ColorSchemes [35d6a980-a343-548e-a6ea-1d62b119f2f4] to /data/home/user/.julia/compiled/v1.8/ColorSchemes/jl_Xc5Qo8.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] include
    @ ./Base.jl:419 [inlined]
 [10] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
    @ Base ./loading.jl:1554
 [11] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/PlotUtils/igbcf/src/PlotUtils.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile PlotUtils [995b91a9-d308-5afd-9ec6-746e21dbc043] to /data/home/user/.julia/compiled/v1.8/PlotUtils/jl_aTyaYy.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] include
    @ ./Base.jl:419 [inlined]
 [10] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
    @ Base ./loading.jl:1554
 [11] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/Plots/FCUr0/src/Plots.jl:1
in expression starting at stdin:1
Failed to precompile Plots [91a5bcdd-55d7-5caf-9e0b-520d859cae80] to /data/home/user/.julia/compiled/v1.8/Plots/jl_Jy6oNr.

Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] eval
    @ ./boot.jl:368 [inlined]
 [10] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
    @ Base ./loading.jl:1428

No, this is obviously a different issue: your error log indicates that the number of BLAS threads is the limiting factor. See:
https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.BLAS.set_num_threads

Thank you. I had already limited the BLAS threads using

using LinearAlgebra
LinearAlgebra.BLAS.set_num_threads(2)

The BLAS error persists, but it did not appear for some precompile attempts, which leads me to think that the issue is elsewhere (i.e. what the OP describes).

What does the minimal reproducer that causes the precompilation errors look like?
It’s weird that you limited the number of threads that BLAS may use to just two, but your error messages seem to say that 300 is the limit.

Perhaps the code you’re precompiling also calls LinearAlgebra.BLAS.set_num_threads?

Can you reproduce it from a clean Julia session? Perhaps it’s related to some linear algebra work running in the background while Julia is trying to precompile?

The error is from a clean session. I will mention that I share resources (RStudio Workbench) but I have the same result whether I run Julia from JupyterLab or the terminal. No other users use Julia. The error does not always appear when I precompile, so I wonder if it does have to do with the shared resources (our shared thread limit is 300). When I went to precompile packages today, I didn’t have problems with Plots.jl but did have the problem with another package. Is it just the timing? But usually if a package fails to precompile once I cannot get it to work for the remainder of the session (sometimes it will precompile during another session).

┌ Info: Precompiling MCMCChains [c7f686f2-ff18-58e9-bc7b-31028e88f75d]
└ @ Base loading.jl:1662
OpenBLAS blas_thread_init: pthread_create failed for thread 28 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 29 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 30 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 31 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 28 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 29 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 30 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
OpenBLAS blas_thread_init: pthread_create failed for thread 31 of 32: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 300 current, 300 max
ERROR: LoadError: IOError: could not spawn `/data/home/user/julia-1.8.0/bin/julia -Cnative -J/data/home/user/julia-1.8.0/lib/julia/sys.so -O0 -g1 --color=yes --startup-file=no -O0 --output-ji /data/home/user/.julia/compiled/v1.8/MicroCollections/jl_cY8Ch1 --output-incremental=yes --startup-file=no --history-file=no --warn-overwrite=yes --color=yes -`: resource temporarily unavailable (EAGAIN)
Stacktrace:
  [1] _spawn_primitive(file::String, cmd::Cmd, stdio::Vector{Union{RawFD, IO}})
    @ Base ./process.jl:128
  [2] #724
    @ ./process.jl:139 [inlined]
  [3] setup_stdios(f::Base.var"#724#725"{Cmd}, stdios::Vector{Union{RawFD, IO}})
    @ Base ./process.jl:223
  [4] _spawn
    @ ./process.jl:138 [inlined]
  [5] _spawn(::Base.CmdRedirect, ::Vector{Union{RawFD, IO}}) (repeats 2 times)
    @ Base ./process.jl:166
  [6] open(cmds::Base.CmdRedirect, stdio::IOContext{Base.PipeEndpoint}; write::Bool, read::Bool)
    @ Base ./process.jl:397
  [7] open(cmds::Base.CmdRedirect, mode::String, stdio::IOContext{Base.PipeEndpoint})
    @ Base ./process.jl:366
  [8] create_expr_cache(pkg::Base.PkgId, input::String, output::String, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, internal_stderr::IO, internal_stdout::IO)
    @ Base ./loading.jl:1590
  [9] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1671
 [10] compilecache
    @ ./loading.jl:1649 [inlined]
 [11] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
 [12] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
 [13] macro expansion
    @ ./loading.jl:1180 [inlined]
 [14] macro expansion
    @ ./lock.jl:223 [inlined]
 [15] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
 [16] include
    @ ./Base.jl:419 [inlined]
 [17] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
    @ Base ./loading.jl:1554
 [18] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/Transducers/HBMTc/src/Transducers.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile Transducers [28d57a85-8fef-5791-bfe6-a80928e7c999] to /data/home/user/.julia/compiled/v1.8/Transducers/jl_WHTXiQ.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] include
    @ ./Base.jl:419 [inlined]
 [10] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::String)
    @ Base ./loading.jl:1554
 [11] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/AbstractMCMC/6aLyN/src/AbstractMCMC.jl:1
in expression starting at stdin:1
ERROR: LoadError: Failed to precompile AbstractMCMC [80f14c24-f653-4e6a-9b94-39d6b0f70001] to /data/home/user/.julia/compiled/v1.8/AbstractMCMC/jl_M8Y162.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] include
    @ ./Base.jl:419 [inlined]
 [10] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
    @ Base ./loading.jl:1554
 [11] top-level scope
    @ stdin:1
in expression starting at /data/home/user/.julia/packages/MCMCChains/IKF6o/src/MCMCChains.jl:1
in expression starting at stdin:1
Failed to precompile MCMCChains [c7f686f2-ff18-58e9-bc7b-31028e88f75d] to /data/home/user/.julia/compiled/v1.8/MCMCChains/jl_cXPm33.

Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
    @ Base ./loading.jl:1705
  [3] compilecache
    @ ./loading.jl:1649 [inlined]
  [4] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1337
  [5] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [6] macro expansion
    @ ./loading.jl:1180 [inlined]
  [7] macro expansion
    @ ./lock.jl:223 [inlined]
  [8] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
  [9] eval
    @ ./boot.jl:368 [inlined]
 [10] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
    @ Base ./loading.jl:1428

Whoever provides this setup (very probably it’s them) has configured RLIMIT_NPROC for your user to be 300. This means that user can’t at any point have more than 300 concurrently running tasks (each thread in each process is a task). When I say “user”, I mean OS user/Linux user.
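
As a rough back-of-the-envelope check of how a single user can reach 300, here is a sketch using numbers read off the logs above. It assumes that each Julia process tries to start 32 OpenBLAS threads (the "of 32" in the messages) and that the Plots failure involves 7 live Julia processes: the session itself plus one nested precompile worker each for Plots, PlotUtils, ColorSchemes, ColorVectorSpace, SpecialFunctions and LogExpFunctions. Both figures are assumptions, and all other threads (Julia's own, RStudio, Jupyter) are ignored.

```shell
# Numbers taken from the logs in this thread; the per-process figure is an assumption.
limit=300             # RLIMIT_NPROC reported in the OpenBLAS messages
threads_per_proc=32   # "thread N of 32" in the OpenBLAS messages
nested_procs=7        # session + 6 nested precompile workers in the Plots trace

blas_threads=$((nested_procs * threads_per_proc))
headroom=$((limit - blas_threads))
echo "BLAS threads: $blas_threads, headroom: $headroom"
# prints "BLAS threads: 224, headroom: 76"
```

Under these assumptions the nested precompilation alone would take 224 of the 300 slots, which leaves little room for anything else the same Linux user is running.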

It might be prudent to run the command ulimit -a to find out about some of these kinds of limits that you’re bound by. Note that some of these are limits across your user, while others are per-process and per-thread. For me (personal laptop, I didn’t set any limits) the result is:

ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 29091
max locked memory           (kbytes, -l) 8192
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 29091
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited
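
To see how close a user is to the process limit at a given moment, something like the following may help. It is only a sketch, assuming bash and a Linux /proc filesystem: it sums the Threads field of every visible process whose real UID matches yours and prints it next to the value of ulimit -u. The variable names are made up for illustration.

```shell
uid=$(id -u)
used=0
for status in /proc/[0-9]*/status; do
  # Processes can exit while we iterate, so read errors are ignored.
  n=$(awk -v uid="$uid" '
        /^Uid:/ { mine = ($2 == uid) }        # second field is the real UID
        /^Threads:/ && mine { print $2 }      # thread count of this process
      ' "$status" 2>/dev/null)
  used=$((used + ${n:-0}))
done
limit=$(ulimit -u)
echo "threads in use by uid $uid: $used (limit: $limit)"
```

Running this once before and once during a precompile should give an idea of how many tasks Julia adds on top of whatever else is already running under the same user.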

Do you know whether the human users each have their own Linux user, or do you all share the same Linux user? It might be interesting for you to run something like this to get the current visibly active users: ps -A -o uid | sort | uniq (replace uid with user to get user names instead of user IDs). Running just ps -o uid (without -A) should get you just the information for your own user.

You should know that Linux supports better and finer-grained ways to enforce resource limits (cgroups, namespaces). I think the RLIMIT_NPROC limit of 300 doesn’t actually make much sense. It seems to me it would be better to set that limit at something like 5000 and then limit the CPU time available to all processes. The goal should be to make the system usable for everyone, not to make it unusable for everyone.

Thank you so much for the insight. It seems you are right about the 300 being a user limit. I admit, I don’t understand our setup very thoroughly as this is the first time it has become necessary for me to dig into it. Here are my ulimit -a results for reference:

ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1030093
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 300
virtual memory          (kbytes, -v) 31309824

I think we each have our own Linux user. When I run ps -A -o uid | sort | uniq I can see group members have their own username/uid output. When we first started encountering the thread limit issue (it was showing up for R users as well) we upped the max user processes from 200 to 300. It didn’t seem to me that I would hit a limit of 300 as a single user precompiling in Julia, so I’m still confused about getting these OpenBLAS resource errors, especially when I have set LinearAlgebra.BLAS.set_num_threads(). Is there anything else I can pursue from the Julia side of things to resolve/avoid this issue?

I have very limited understanding of the rationale behind our shared resource settings, but I will mention your suggestion of limiting cpu time rather than user processes as a potential solution from the server side.
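
On the Julia side, one thing that may be worth trying (a sketch, not a confirmed fix for this setup): BLAS.set_num_threads only affects the session in which it is called, whereas each precompile worker is a separate julia process that initializes OpenBLAS on its own. OpenBLAS reads the OPENBLAS_NUM_THREADS environment variable when it starts, and exported variables are inherited by child processes, so setting it in the shell before launching Julia should reach the workers too. The nested sh below only stands in for a worker to show the inheritance; the commented `julia` line is an assumption about how Julia is launched.

```shell
# Ask OpenBLAS to start a single thread in every process launched from this shell.
export OPENBLAS_NUM_THREADS=1

# A child process (here a nested sh, standing in for a precompile worker)
# sees the same value without any further configuration.
child_value=$(sh -c 'printf "%s" "$OPENBLAS_NUM_THREADS"')
echo "child process sees OPENBLAS_NUM_THREADS=$child_value"

# Then start Julia from this same shell, e.g.:
#   julia --project=.
```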

Thanks, this is very helpful! I obtain

-u: processes                       128

on my cluster, which is causing the same OpenBLAS blas_thread_init issue for me. I suppose the recent PRs “set number of openblas threads to 1 while precompiling” by KristofferC (Pull Request #46792, JuliaLang/julia) and “Limit initial OpenBLAS thread count” by staticfloat (Pull Request #46844, JuliaLang/julia) might help resolve this.