I am trying to use the ClusterManagers package for parallel computing, but I am getting an error message that I don't understand. I am writing my code in a Jupyter notebook in VS Code, and this is the code I am starting from:
using ClusterManagers
using Distributed
OnCluster = true   # set to false to run locally
addWorkers = true  # set to false to run serially
println("OnCluster = $(OnCluster)")
# Current number of workers
currentWorkers = nworkers()
println("Initial number of workers = $(currentWorkers)")
# Increase the number of workers available
maxNumberWorkers = 10
if addWorkers
    if OnCluster
        addprocs(SlurmManager(maxNumberWorkers))
    else
        addprocs(maxNumberWorkers)
    end
end
However, I get an error message:
OnCluster = true
Initial number of workers = 1
Error launching Slurm job:
TaskFailedException
nested task error: IOError: could not spawn `srun -J julia-59209 -n 10 -o '/Users/my_name/Documents/my_project/Julia Code/Quantitative (simple)/./julia-59209-16770052037-%4t.out' -D '/Users/my_name/Documents/my_project/Julia Code/Quantitative (simple)' /System/Volumes/Data/Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia --worker=arf01ZXwDrm50sEj`: no such file or directory (ENOENT)
Stacktrace:
[1] _spawn_primitive(file::String, cmd::Cmd, stdio::Vector{Union{RawFD, IO}})
@ Base ./process.jl:128
[2] #725
@ ./process.jl:139 [inlined]
[3] setup_stdios(f::Base.var"#725#726"{Cmd}, stdios::Vector{Union{RawFD, IO}})
@ Base ./process.jl:223
[4] _spawn
@ ./process.jl:138 [inlined]
[5] #open#734
@ ./process.jl:393 [inlined]
[6] open (repeats 2 times)
@ ./process.jl:383 [inlined]
[7] launch(manager::SlurmManager, params::Dict{Symbol, Any}, instances_arr::Vector{WorkerConfig}, c::Condition)
@ ClusterManagers ~/.julia/packages/ClusterManagers/S7Syg/src/slurm.jl:60
[8] (::Distributed.var"#43#46"{SlurmManager, Condition, Vector{WorkerConfig}, Dict{Symbol, Any}})()
@ Distributed ./task.jl:484
It seems there might be a problem with the path, but I don't know how to fix it, since I have very little background in computer science. Can anyone suggest a solution? Thanks in advance.
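One thing I can check myself: the command that failed to spawn in the trace is `srun`, and ENOENT on a spawn usually means the executable could not be found. Since the paths in the trace are under `/Users/...` and `/Applications/Julia-1.8.app`, my guess is that the notebook is running on my own Mac rather than on a machine with Slurm installed. Here is a minimal sketch of a check for that, assuming that a missing `srun` is indeed the cause. `slurm_available` and `choose_backend` are hypothetical helper names I made up; they are not part of ClusterManagers or Distributed.

```julia
# Hypothetical helper (not part of ClusterManagers): `Sys.which` returns
# `nothing` when the executable cannot be found on the PATH.
slurm_available() = Sys.which("srun") !== nothing

# Hypothetical helper: choose :slurm only when it was requested AND
# `srun` is available on this machine; otherwise fall back to :local.
function choose_backend(onCluster::Bool)
    if onCluster && !slurm_available()
        @warn "srun not found on PATH; Slurm workers cannot be launched from this machine"
        return :local
    end
    return onCluster ? :slurm : :local
end

backend = choose_backend(true)
println("Backend = $(backend)")
# backend == :slurm  -> addprocs(SlurmManager(maxNumberWorkers))
# backend == :local  -> addprocs(maxNumberWorkers)
```

If this prints `Backend = local` even though `OnCluster = true`, then `srun` is not visible to the Julia process, and `SlurmManager` would presumably have to be used from a login node of the cluster instead.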