How to set up the MPI.jl environment and how to know which compiler it is using?

mpi

#1

Hello,

I’m new to MPI for Julia and I want to use it for my model. My problem concerns the link between the package folder .julia/v0.6/MPI and the MPI installation. It seems that Julia lost the link and I don’t know how to restore it manually.
When I try to run the example from the Julia prompt, the message is

julia> include("01-hello.jl")
[linux-5hui:02260] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_btl_usnic: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
[linux-5hui:02260] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_mtl_ofi: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
[linux-5hui:02260] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_mtl_psm: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
Hello world, I am 0 of 1

And if I try to run the same example from the terminal, the message is

[mmonterr@linux-5hui ~/.julia/v0.6/MPI/examples]$ mpirun -np 4 julia 01-hello.jl
ERROR: LoadError: ArgumentError: module MPI not found in current path.
Run Pkg.add("MPI") to install the MPI package.
while loading /home/mmonterr/.julia/v0.6/MPI/examples/01-hello.jl, in expression starting on line 1
Primary job terminated normally, but 1 process returned
a non-zero exit code… Per user-direction, the job has been aborted.
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[55478,1],0]
Exit code: 1

I have also tried to set certain environment variables by creating a .juliarc script, as the documentation describes, but without success. Thank you.
Marisol


#2

Could you post your .juliarc script? It looks like something weird is happening with your package configuration.

Two other things to try:

  julia ./01-hello.jl

and

  mpirun -np 1 julia ./01-hello.jl

If you post the output of these commands, I might be able to give further suggestions.


#3

I suggest executing Pkg.checkout("MPI") and Pkg.build("MPI") from the Julia prompt, then trying
mpirun -np 4 julia 01-hello.jl from the system prompt. I have found that when the installed MPI libraries are updated, it’s sometimes necessary to rebuild the package.


#4

Hi,
I have installed a new Julia version using the command zypper install julia, which provides Julia v0.4.7. I know that this is not the suitable version for the MPI package; however, I obtained the following output when adding MPI (Pkg.add("MPI")):

julia> Pkg.add("MPI")
INFO: Installing BinDeps v0.4.7
INFO: Installing Compat v0.26.0
INFO: Installing MPI v0.5.1
INFO: Installing SHA v0.3.3
INFO: Installing URIParser v0.1.8
INFO: Building MPI
INFO: Attempting to Create directory /home/mmonterr/.julia/v0.4/MPI/deps/build
INFO: Changing Directory to /home/mmonterr/.julia/v0.4/MPI/deps/build
– The Fortran compiler identification is GNU 4.8.5
– The C compiler identification is GNU 4.8.5
– Check for working Fortran compiler: /usr/bin/gfortran
– Check for working Fortran compiler: /usr/bin/gfortran – works
– Detecting Fortran compiler ABI info
– Detecting Fortran compiler ABI info - done
– Checking whether /usr/bin/gfortran supports Fortran 90
– Checking whether /usr/bin/gfortran supports Fortran 90 – yes
– Check for working C compiler: /usr/bin/cc
– Check for working C compiler: /usr/bin/cc – works
– Detecting C compiler ABI info
– Detecting C compiler ABI info - done
– Detecting C compile features
– Detecting C compile features - done
– Found Git: /usr/bin/git (found version “2.12.3”)
– Found MPI_C: /usr/lib64/mpi/gcc/openmpi/lib64/libmpi.so
– Found MPI_Fortran: /usr/lib64/mpi/gcc/openmpi/lib64/libmpi_usempi.so;/usr/lib64/mpi/gcc/openmpi/lib64/libmpi_mpifh.so;/usr/lib64/mpi/gcc/openmpi/lib64/libmpi.so
– Detecting Fortran/C Interface
– Detecting Fortran/C Interface - Found GLOBAL and MODULE mangling
– Looking for MPI_Comm_c2f
– Looking for MPI_Comm_c2f - found
– Configuring done
– Generating done
– Build files have been written to: /home/mmonterr/.julia/v0.4/MPI/deps/build
Scanning dependencies of target gen_constants
[ 12%] Building Fortran object CMakeFiles/gen_constants.dir/gen_constants.f90.o
[ 25%] Linking Fortran executable gen_constants
[ 25%] Built target gen_constants
Scanning dependencies of target version
[ 25%] Built target version
Scanning dependencies of target gen_functions
[ 37%] Building C object CMakeFiles/gen_functions.dir/gen_functions.c.o
[ 50%] Linking C executable gen_functions
[ 50%] Built target gen_functions
Scanning dependencies of target mpijl-build
[ 62%] Generating mpi-build.jl
[ 62%] Built target mpijl-build
Scanning dependencies of target mpijl
[ 75%] Generating compile-time.jl
[ 75%] Built target mpijl
Scanning dependencies of target juliampi
[ 87%] Building Fortran object CMakeFiles/juliampi.dir/test_mpi.f90.o
[100%] Linking Fortran shared library libjuliampi.so
[100%] Built target juliampi
[ 25%] Built target gen_constants
[ 25%] Built target version
[ 50%] Built target gen_functions
[ 62%] Built target mpijl-build
[ 75%] Built target mpijl
[100%] Built target juliampi
Install the project…
– Install configuration: “”
– Installing: /home/mmonterr/.julia/v0.4/MPI/deps/src/./compile-time.jl
– Installing: /home/mmonterr/.julia/v0.4/MPI/deps/usr/lib/libjuliampi.so
– Set runtime path of “/home/mmonterr/.julia/v0.4/MPI/deps/usr/lib/libjuliampi.so” to “/usr/lib64/mpi/gcc/openmpi/lib64”

After that, I ran the following example: mpirun -n 2 julia 01-hello.jl, which produces the following output:

[linux-5hui:05919] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_btl_usnic: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
[linux-5hui:05920] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_btl_usnic: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
[linux-5hui:05919] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_mtl_ofi: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
[linux-5hui:05919] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_mtl_psm: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
[linux-5hui:05920] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_mtl_ofi: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
[linux-5hui:05920] mca: base: component_find: unable to open /usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_mtl_psm: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
Hello world, I am 1 of 2Hello world, I am 0 of 2

It seems MPI is working, but I don’t understand the errors… do you have any idea? I’m not sure whether the Julia version (0.6 or 0.4) is the cause of the problem.

Furthermore, if I install Julia 0.6 from scratch, the MPI package doesn’t find my OpenMPI installation… all my tests were carried out on openSUSE.

Thank you!


#5

The errors are probably caused by a known problem with OpenMPI.
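
One way to confirm which library the components are missing is sketched below. It assumes that the component file on disk carries a .so suffix (the log prints the name without it) and that ldd is available; missing_libs is a hypothetical helper name, not part of OpenMPI. Since the messages end in "(ignored)" and the "Hello world" line is still printed, they look like warnings about optional components rather than fatal errors.

```shell
#!/bin/sh
# Hypothetical helper: read ldd-style output on stdin and print the names
# of the shared libraries that the dynamic linker could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Assumed on-disk name of one component from the log (".so" suffix added).
plugin="/usr/lib64/mpi/gcc/openmpi/lib64/openmpi/mca_mtl_psm.so"

# Only inspect the file if it exists on this machine.
if [ -f "$plugin" ]; then
    ldd "$plugin" | missing_libs
fi
```

If this prints libpsm_infinipath.so.1, the component depends on a library that is not installed on the machine.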

I’m not sure what the problem is with Julia 0.6. The error it is reporting is that it can’t find MPI.jl (the julia package), not the OpenMPI installation. Are you sure your home directory is mounted on the compute nodes? I think there were some changes to how Julia loads packages in 0.6.


#6

Hi Jared,

If I install Julia v0.6.0, the message I obtain is:

julia> Pkg.add("MPI")
INFO: Installing MPI v0.6.0
INFO: Building MPI
INFO: Attempting to Create directory /home/mmonterr/.julia/v0.6/MPI/deps/build
INFO: Changing Directory to /home/mmonterr/.julia/v0.6/MPI/deps/build
INFO: Package database updated

I don’t know if the home directory is mounted on the compute nodes… How can I check that?
Thank you very much


#7

According to that last output, it appears that the MPI package installed correctly with julia v0.6. Does the 01-hello example run as expected? If so, then everything is OK.

The issue of /home being mounted or not on compute nodes is only relevant if your call to mpirun is asking to use nodes, and from the above, you’re not doing that. I.e., you are doing

mpirun -n 2 julia 01-hello.jl

which just runs multiple processes on the host computer. This works fine on multicore computers.

To use other nodes, on a cluster, the command would be something like
mpirun -n 2 --hostfile FILENAME julia 01-hello.jl

In that case, you need to have all of the files available on all of the nodes, and sharing /home is the easiest way to do it.
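
A rough way to check whether the files are visible on every node is to let each MPI process report whether the package directory exists. The sketch below is only an illustration: the script name check_node.sh, the function check_dir and the variable PKG_DIR are made up here, and the default directory is the one from the earlier posts.

```shell
#!/bin/sh
# check_node.sh (hypothetical name): meant to be started once per MPI process.
# Prints the node name and whether the given directory exists there.
check_dir() {
    if [ -d "$1" ]; then
        echo "$(uname -n): OK $1"
    else
        echo "$(uname -n): MISSING $1"
        return 1
    fi
}

# PKG_DIR is an assumed variable; it defaults to the package folder from this thread.
check_dir "${PKG_DIR:-$HOME/.julia/v0.6/MPI}" \
    || echo "$(uname -n): share or copy the directory to this node"
```

It could be launched with something like mpirun -n 2 --hostfile FILENAME sh check_node.sh. For this check the script itself must be reachable on every node, which is one more reason a shared /home is convenient.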


#8

@mcreel I agree with what you say. Yes, having /home shared on all the compute servers is the easiest way.
In fact, my experience on clusters is that if $HOME (or something that stands in for it) is not available on the compute servers, then lots of things break. You might think it is possible to engineer a cluster without a shared home, but things do break (not Julia in particular).

Getting to my point, I had a discussion about the Pkg3 mechanism a few weeks ago. We will need a Julia that is flexible enough to be installed on a shared NFS area separate from /home in order for it to be usable on HPC clusters.
I believe Pkg3 will make this a lot more manageable.


#9

Hello everybody,
when I try to execute the example I obtain:

[mmonterr@linux-5hui ~/.julia/v0.6/MPI/examples]$ mpirun -n 2 julia 01-hello.jl

mpirun was unable to find the specified executable file, and therefore
did not launch the job. This error was first reported for process
rank 0; it may have occurred for other processes as well.

NOTE: A common cause for this error is misspelling a mpirun command
line parameter option (remember that mpirun interprets the first
unrecognized command line token as the executable).

Node: linux-5hui
Executable: julia

2 total processes failed to start

It seems that something is wrong with mpirun.


#10

The error message says mpirun could not find the julia executable. This doesn’t sound like a problem with mpirun but with your PATH environment variable. A quick test would be to run mpirun -n 2 /full/path/to/julia (specifying the full path avoids needing to use the PATH to look up the location of the executable). If this is the problem, a quick Google search will show you how to set environment variables.
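
For illustration, here is a minimal sketch of checking and fixing the PATH in a POSIX shell. The directory in JULIA_BIN_DIR is an assumption (replace it with wherever your julia binary actually lives), and find_on_path and prepend_path are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: print where an executable is found on PATH,
# or report on stderr that it is missing and return non-zero.
find_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
    else
        echo "$1: not found on PATH" >&2
        return 1
    fi
}

# Hypothetical helper: prepend a directory to PATH unless it is already there.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Assumed install location; adjust to your system.
JULIA_BIN_DIR="${JULIA_BIN_DIR:-$HOME/julia-0.6/bin}"
prepend_path "$JULIA_BIN_DIR"
find_on_path julia || echo "julia still not found; check $JULIA_BIN_DIR"
```

To make the change permanent, the same PATH assignment would normally go in a shell startup file such as ~/.bashrc, so that the shell from which mpirun is started also sees it.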


#11

Dear Jared,

I solved the problem thanks to your comment! Now I can execute the MPI.jl examples.
Thank you very much.


#12

You’re welcome. Best of luck with your computations.