MPI.jl tasks on multiple nodes

I want to submit a job that runs on 4 nodes, with 10 processes on each node.
However, the following script runs on only 1 node:

bsub -J kongdd_m01 \
     -o log_all.out \
     -e log_all.err \
     -n 40 \
     -R "span[ptile=10]" \
     mpiexecjl -n 4 --map-by node julia -t 10 ex1_Threads.jl

cpus_per_node=32
nodes=4
# Total slots requested from LSF: cores per node times number of nodes
cpus=$((cpus_per_node * nodes))

bsub -J kongdd_m01 \
     -o log_all.out \
     -e log_all.err \
     -n $cpus \
     -R "span[ptile=$cpus_per_node]" \
     mpiexecjl -n $nodes -ppn 1 julia -t $cpus_per_node ex1_Threads.jl
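
A dry-run sketch of the same arithmetic, assuming the layout from the question (4 nodes, 10 cores each). The value 10 and the `cmd` variable are illustrative assumptions and are not part of the script above. It only prints the `bsub` line instead of submitting it, so the slot count and `ptile` value can be checked first. One caveat: `-ppn` is a Hydra-style launcher flag (MPICH, Intel MPI), while `--map-by` is an Open MPI option. As far as I know, `mpiexecjl` passes its arguments through to whichever `mpiexec` MPI.jl is configured with, so which flag works depends on that installation.

```shell
#!/usr/bin/env bash
# Dry run: build the LSF submission line and print it; nothing is submitted.

cpus_per_node=10                  # assumption: 10 cores per node, as in the question
nodes=4                           # number of nodes to span
cpus=$((cpus_per_node * nodes))   # total slots to request with bsub -n

# One MPI rank per node (-ppn 1, Hydra-style flag), each rank running
# Julia with one thread per core on that node.
cmd="bsub -J kongdd_m01 -o log_all.out -e log_all.err -n $cpus -R \"span[ptile=$cpus_per_node]\" mpiexecjl -n $nodes -ppn 1 julia -t $cpus_per_node ex1_Threads.jl"

echo "$cmd"
```

If the printed line looks right, running it directly (or replacing `echo "$cmd"` with the `bsub` call itself) submits the job.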