It’s hard to say what the issue is without more information. Have you checked that you have successfully launched Julia worker processes on multiple nodes? Just a guess but one potential culprit might be ssh tunneling.
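One way to check where your workers actually ended up is to ask each one for its hostname. This is only a minimal sketch: `worker_hosts` is a name I made up for illustration, and on 1.0 you need `using Distributed` first (on 0.6 these functions live in Base).

```julia
using Distributed

# Map each worker id to the hostname of the machine it runs on.
# With no workers added, workers() returns [1], i.e. just the master process.
worker_hosts() = Dict(w => remotecall_fetch(gethostname, w) for w in workers())

for (id, host) in worker_hosts()
    println("worker $id runs on $host")
end
```

If every entry shows the same hostname, the workers were all started on one node.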
Which version of Julia are you using? I haven’t tried 1.0 in a multi-machine setting yet, but on 0.6.4 on my research group’s cluster, addprocs fails to connect to workers on nodes other than the one hosting the master process unless I indicate that ssh tunneling is required. E.g., for me (where tera31 and tera32 are the hostnames of two nodes)
procs = ["tera31","tera32"]
addprocs(procs; tunnel=true)
works. If you’re in an environment where it takes a long time to establish connections with remote workers, you can also try setting the JULIA_WORKER_TIMEOUT environment variable (in seconds; the default is 60) on the master process before calling addprocs. This makes Julia wait longer before giving up on connecting to workers.
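A rough sketch of setting the timeout from within Julia follows. The value 120 is an arbitrary choice of mine, and `add_cluster_workers` is a hypothetical helper name; the hostnames are the ones from my cluster, so substitute your own.

```julia
using Distributed

# Set the timeout (in seconds) before any call to addprocs.
# 120 is an arbitrary example value, not a recommendation.
ENV["JULIA_WORKER_TIMEOUT"] = "120"

# Hypothetical helper: launch one worker per listed host over ssh,
# with tunneling enabled as in the example above.
add_cluster_workers(hosts) = addprocs(hosts; tunnel=true)

# On the cluster you would then call, for example:
# add_cluster_workers(["tera31", "tera32"])
```

You can equally export the variable in the shell or job script that starts the master process.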
I can say that launching Julia via julia -p n, for some number n, is meant for starting multiple workers on a single machine. To launch workers on multiple machines you need to start Julia with a machine file (the --machine-file option; --machinefile on 0.6). This post has an example of how to do that with PBS. That’s about all I know. If that doesn’t get you going, I would look around for more resources on, or ask for help with, getting Julia working with cluster job schedulers.
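As an alternative to passing a machine file on the command line, you can read the host list yourself and hand it to addprocs. The sketch below assumes you are inside a PBS job, where PBS sets the PBS_NODEFILE environment variable to a file with one hostname per line; `read_hosts` is a helper name I made up, and I have not tested this on 1.0.

```julia
using Distributed

# Read one hostname per line, trimming whitespace and skipping blank lines.
read_hosts(path) = filter(!isempty, String.(strip.(readlines(path))))

# PBS_NODEFILE only exists inside a PBS job, so do nothing otherwise.
if haskey(ENV, "PBS_NODEFILE")
    hosts = read_hosts(ENV["PBS_NODEFILE"])
    # A hostname that appears several times gets that many workers.
    addprocs(hosts; tunnel=true)
end
```

Whether you need tunnel=true depends on how your cluster’s network is set up; on mine it was required.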