@mcreel, that’s almost the right answer – you sent me down the right path. I wasn’t using the paradigm you linked, where only the workers use MPI code; I was instead using the one where the master and the workers all use MPI, through TCP/IP transport.
When I switched to the workers-only-use-MPI paradigm, this MWE produced the results I expected:
using MPIClusterManagers, Distributed
manager = MPIManager(np=4)   # MPI is used only among the 4 workers; the master talks to them over TCP/IP
addprocs(manager)            # launch the 4 MPI worker processes
@info "workers are $(workers())"
exit()
I get [ Info: workers are [2, 3, 4, 5]
Please note that this is NOT run through mpirun. After requesting 4 CPUs from PBS, I simply ran
julia myscript.jl >& outfile
My parallel Julia code, using @sync, @async, and remotecall_fetch, worked as expected, with near 90% CPU utilisation on all 4 requested cores of the PBS-assigned node.
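For reference, the structure of that code is roughly the following sketch (dowork and its arguments are placeholders standing in for my actual computation, not the real code):

using MPIClusterManagers, Distributed

manager = MPIManager(np=4)
addprocs(manager)

# placeholder work function, made available on every worker
@everywhere dowork(i) = sum(abs2, randn(10^7)) + i

results = Vector{Float64}(undef, nworkers())
@sync for (idx, w) in enumerate(workers())
    # each @async task blocks on its own remotecall_fetch, so the
    # workers run concurrently while @sync waits for all of them
    @async results[idx] = remotecall_fetch(dowork, w, idx)
end
@info "results are $results"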