AOCL (not MKL) acceleration on AMD Ryzen CPUs

Thank you, @Elrod.
First of all, an internet search for dnrm2 leads to pages that describe it as:

 *> DNRM2 returns the euclidean norm of a vector via the function
 *> name, so that
 *>
 *>    DNRM2 := sqrt( x'*x )

So it is documented alongside LAPACK, but it actually belongs to the BLAS library.

Inspecting the libflame.so as suggested:

$ nm -D libflame.so | grep dnrm2
00000000002a73f0 T bl1_dnrm2
                 U dnrm2_

According to the nm documentation, U means “The symbol is undefined”, which in turn, I think, means that the code for that function is not in libflame.so. This is what I expected if dnrm2 is in BLAS. So I inspected the libblis.so (single-threaded) and libblis-mt.so (multithreaded) libraries, and got:

$ nm -D libblis.so | grep dnrm2_
0000000000930100 T dnrm2_
0000000000930090 T dnrm2_blis_impl

and

$ nm -D libblis-mt.so | grep dnrm2
000000000094a7d0 T cblas_dnrm2
00000000009a9fb0 T dnrm2
000000000092dd70 T dnrm2_
000000000092dd00 T dnrm2_blis_impl
00000000009ab780 T dnrm2sub
0000000000944aa0 T dnrm2sub_
0000000000944a90 T dnrm2sub_blis_impl

respectively. According to the nm documentation, “T” or “t” means: “The symbol is in the text (code) section.”

So the function exists, not in libflame.so but in libblis.so and libblis-mt.so. That means I have to link to one of these libraries (I am interested in the multithreaded one).
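The T/U check above can be wrapped in a small helper. This is only a sketch: `classify_symbol` is a name made up for illustration, and it assumes the plain GNU `nm -D` output format shown above (symbol name in the last column, type letter just before it, no `@VERSION` suffix on the name).

```shell
# classify_symbol SYMBOL: read `nm -D` output on stdin and report whether
# SYMBOL is defined (type T/t) or only referenced (type U) in that library.
classify_symbol() {
    awk -v sym="$1" '
        # nm prints "ADDRESS TYPE NAME" for defined symbols and
        # "TYPE NAME" (no address) for undefined ones, so the name is
        # always the last field and the type is the field before it.
        $NF == sym {
            type = $(NF - 1)
            if (type == "U")
                print sym ": undefined (must come from another library)"
            else if (type == "T" || type == "t")
                print sym ": defined in text section"
            else
                print sym ": type " type
            found = 1
        }
        END { if (!found) print sym ": not present" }
    '
}

# Usage (library paths are assumptions about the local install):
#   nm -D libflame.so   | classify_symbol dnrm2_
#   nm -D libblis-mt.so | classify_symbol dnrm2_
```

Matching on the exact symbol name avoids the false hits that a plain `grep dnrm2` gives, such as `bl1_dnrm2` or `dnrm2sub_`.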

As recommended in the suggested thread, I checked my LD_LIBRARY_PATH and got:

/opt/AMD/aocl/aocl-linux-aocc-4.2.0/aocc/lib:/opt/AMD/aocc-compiler-4.2.0/ompd:/opt/AMD/aocc-compiler-4.2.0/lib:/opt/AMD/aocc-compiler-4.2.0/lib32:/usr/lib/x86_64-linux-gnu:/usr/lib64:/usr/lib32:/usr/lib

Going to the first listed directory:

$ ls
cmake                     libblis-mt.so            libfftw3l_mpi.la         libfftw3q_omp.so.3
libalcp.a                 libblis-mt.so.4          libfftw3l_mpi.so         libfftw3q_omp.so.3.6.10
libalcp.so                libblis-mt.so.4.2.0      libfftw3l_mpi.so.3       libfftw3q.so
libalm.a                  libblis.so               libfftw3l_mpi.so.3.6.10  libfftw3q.so.3
libalmfast.a              libblis.so.4             libfftw3l_omp.a          libfftw3q.so.3.6.10
libalmfast.so             libblis.so.4.2.0         libfftw3l_omp.la         libfftw3.so
libalm.so                 libbz2.so                libfftw3l_omp.so         libfftw3.so.3
libamdlibm.a              libfftw3.a               libfftw3l_omp.so.3       libfftw3.so.3.6.10
libamdlibmfast.a          libfftw3f.a              libfftw3l_omp.so.3.6.10  libflame.a
libamdlibmfast.so         libfftw3f.la             libfftw3l.so             libflame.so
libamdlibm.so             libfftw3f_mpi.a          libfftw3l.so.3           libipp-compat.so
libamdsecrng.a            libfftw3f_mpi.la         libfftw3l.so.3.6.10      liblz4.so
libamdsecrng.so           libfftw3f_mpi.so         libfftw3_mpi.a           liblzma.so
libamdsecrng.so.4.2       libfftw3f_mpi.so.3       libfftw3_mpi.la          libopenssl-compat.so
libamdsecrng.so.4.2.0     libfftw3f_mpi.so.3.6.10  libfftw3_mpi.so          librng_amd.a
libaocl_compression.a     libfftw3f_omp.a          libfftw3_mpi.so.3        librng_amd.so
libaocl_compression.so    libfftw3f_omp.la         libfftw3_mpi.so.3.6.10   librng_amd.so.4.2
libaocl-libmem.a          libfftw3f_omp.so         libfftw3_omp.a           librng_amd.so.4.2.0
libaocl-libmem.so         libfftw3f_omp.so.3       libfftw3_omp.la          libscalapack.a
libaocl-libmem.so.4.2.0   libfftw3f_omp.so.3.6.10  libfftw3_omp.so          libscalapack.so
libaoclsparse.a           libfftw3f.so             libfftw3_omp.so.3        libsnappy.so
libaoclsparse.so          libfftw3f.so.3           libfftw3_omp.so.3.6.10   libz.so
libaoclsparse.so.4.2.0.0  libfftw3f.so.3.6.10      libfftw3q.a              libzstd.so
libaoclutils.a            libfftw3.la              libfftw3q.la             pkgconfig
libaoclutils.so           libfftw3l.a              libfftw3q_omp.a
libblis.a                 libfftw3l.la             libfftw3q_omp.la
libblis-mt.a              libfftw3l_mpi.a          libfftw3q_omp.so

We can see libblis.so and libblis-mt.so, so the LD_LIBRARY_PATH is correct.
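Instead of reading the `ls` listing by eye, the same check can be scripted. A minimal sketch, assuming a POSIX shell; `find_lib` is a hypothetical helper name, and it only tests whether a file exists, not whether the dynamic loader would actually pick it.

```shell
# find_lib NAME [PATHLIST]: print every directory in PATHLIST (default:
# $LD_LIBRARY_PATH) that contains a file called NAME.
# The body runs in a subshell so the IFS and globbing changes do not leak.
find_lib() (
    list=${2:-${LD_LIBRARY_PATH:-}}
    IFS=:
    set -f                      # do not expand wildcards in directory names
    for dir in $list; do
        if [ -n "$dir" ] && [ -e "$dir/$1" ]; then
            printf '%s\n' "$dir"
        fi
    done
)

# Usage:
#   find_lib libblis-mt.so
#   find_lib libflame.so
```

If the helper prints more than one directory, the first one listed is the one searched first, so an older copy earlier in the path could shadow the AOCL one.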

In addition, let's see whether libflame.so knows where libblis is:

$ ldd libflame.so 
	linux-vdso.so.1 (0x00007ffd9d4eb000)
	libaoclutils.so => /opt/AMD/aocl/aocl-linux-aocc-4.2.0/aocc/lib/libaoclutils.so (0x0000737c0e67c000)
	libomp.so => /opt/AMD/aocc-compiler-4.2.0/lib/libomp.so (0x0000737c0d200000)
	libpthread.so.0 => /usr/lib/x86_64-linux-gnu/libpthread.so.0 (0x0000737c0e677000)
	libc.so.6 => /usr/lib/x86_64-linux-gnu/libc.so.6 (0x0000737c0ce00000)
	/lib64/ld-linux-x86-64.so.2 (0x0000737c0e688000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x0000737c0ca00000)
	libm.so.6 => /usr/lib/x86_64-linux-gnu/libm.so.6 (0x0000737c0d519000)
	libgcc_s.so.1 => /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000737c0e655000)
	librt.so.1 => /usr/lib/x86_64-linux-gnu/librt.so.1 (0x0000737c0e650000)
	libdl.so.2 => /usr/lib/x86_64-linux-gnu/libdl.so.2 (0x0000737c0e64b000)

There are no unresolved entries, but the output does not show any dependency on libblis.so or libblis-mt.so (I am no expert on ldd, but I think this is the right interpretation).
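ldd prints the fully resolved, transitive list. The direct dependencies recorded in the library itself are its NEEDED entries, which `readelf -d` shows. The sketch below only extracts those names; `needed_libs` is a made-up helper, and it assumes the usual GNU readelf line format `(NEEDED)  Shared library: [name]`.

```shell
# needed_libs: read `readelf -d` output on stdin and print the names of the
# libraries recorded as direct (NEEDED) dependencies, one per line.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# Usage (path is an assumption about the local install):
#   readelf -d libflame.so | needed_libs
```

If no libblis entry shows up there either, then the undefined `dnrm2_` in libflame.so can only be resolved by a BLAS library that the calling process loads itself.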

So the last option, I think, is to link explicitly against libblis-mt.so. According to the AOCL documentation:

To use AOCL-LAPACK in your application, link with AOCL-LAPACK, AOCL-BLAS, and AOCL-Utils libraries while building the application.

AOCL-Utils library has libstdc++ library dependency. As AOCL-LAPACK is dependent on AOCL-Utils, applications must link with libstdc++(-lstdc++) as well.
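For a compiled C or Fortran application (not Julia), my reading of that advice is a link line like the one sketched below. This is an assumption, not something copied from the AOCL documentation: the library names come from the `ls` listing above, `-fopenmp` is my guess at what the multithreaded BLAS needs, and `aocl_link_flags` and `main.c` are hypothetical names.

```shell
# aocl_link_flags DIR: print linker flags for AOCL-LAPACK (libflame) on top of
# the multithreaded AOCL-BLAS (libblis-mt) and AOCL-Utils, installed in DIR.
# libflame comes first because it is the one that references BLAS symbols.
aocl_link_flags() {
    printf '%s\n' "-L$1 -Wl,-rpath,$1 -lflame -lblis-mt -laoclutils -lstdc++ -fopenmp -lm"
}

# Usage (main.c is a placeholder for the application source):
#   AOCL_LIB=/opt/AMD/aocl/aocl-linux-aocc-4.2.0/aocc/lib
#   clang main.c $(aocl_link_flags "$AOCL_LIB") -o main
```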

But how do I link to several shared libraries in Julia? Is there a way to do this with ccall, or do I have to do some unusual compilation or precompilation step?
