Location of libopenblas

Tim_Kelley · April 14, 2019, 10:31pm

I’m trying to make a direct call to dgemm. Julia can’t seem to find the library, which on my mac is in .julia/conda/3/lib instead of /usr/local/lib. Did I make an error in the install?

When I look at the source for gemm for guidance, it seems the library should be called libopenblas64.

As you can see, I am very confused. If I have to reinstall from scratch, I will, but want to do it right.

Thanks,

Elrod · April 14, 2019, 11:31pm

If you’re making the call from within Julia, I would just do it the same way the standard library does:

Use:

julia> BLAS.@blasfunc dgemm_
:dgemm_64_

julia> BLAS.libblas
"libopenblas64_"

EDIT: This has the benefit of also working for folks who build with MKL instead.

Tim_Kelley · April 15, 2019, 2:04pm

Thanks. Should the call look like this for square matrices A, B, and C?

libblas=BLAS.libblas
dgemm=BLAS.@blasfunc :dgemm_
ccall((dgemm, BLAS.libblas), Cvoid,
(Ref{UInt8}, Ref{UInt8}, Ref{BlasInt}, Ref{BlasInt},
Ref{BlasInt}, Ref{Float64}, Ptr{Float64}, Ref{BlasInt},
Ptr{Float64}, Ref{BlasInt}, Ref{Float64}, Ptr{:Float64},
Ref{BlasInt}),
'N', 'N', rowsofA, colsofB,
seconddimofA, alpha, A, leadingdimofA,
B, leadingdimofB, beta, C, leadingdimofC )

When I run it from the REPL things seem to break with
ERROR: TypeError: in ccall: first argument not a pointer or valid constant expression, expected Ptr, got Tuple{Symbol,String}

If I embed it in a function I get a segmentation fault. I know I’m missing something fundamental, but can’t figure out what it is.
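A note on the error: as far as I can tell, ccall wants the (function, library) pair written as a constant expression in the call itself, and `dgemm` above is an ordinary variable, which is what the TypeError is complaining about. If a raw ccall is not essential, the wrapper `BLAS.gemm!` from the LinearAlgebra standard library does the same job. The following is a minimal sketch, assuming small random Float64 matrices as hypothetical test data:

```julia
using LinearAlgebra

# Hypothetical test data: small square Float64 matrices.
n = 4
A = rand(n, n)
B = rand(n, n)
C = zeros(n, n)
alpha, beta = 1.0, 0.0

# BLAS.gemm! overwrites C with alpha*A*B + beta*C and returns C.
BLAS.gemm!('N', 'N', alpha, A, B, beta, C)

# Compare against the generic matrix product.
println(C ≈ A * B)
```

The wrapper works out the dimensions and leading dimensions from the arrays, so there is no argument list to get wrong.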

Tim_Kelley · April 17, 2019, 7:59pm

I finally understood your advice and figured out how to do it. This works:
function klgemv!(trans, m, n, alpha, A, lda, X, incx, beta, Y, incy)
    # Direct call to BLAS dgemv: Y = alpha*op(A)*X + beta*Y, overwriting Y.
    ccall(((BLAS.@blasfunc dgemv_), Base.libblas_name), Cvoid,
          (Ref{UInt8}, Ref{Int64}, Ref{Int64}, Ref{Float64},
           Ptr{Float64}, Ref{Int64}, Ptr{Float64}, Ref{Int64},
           Ref{Float64}, Ptr{Float64}, Ref{Int64}),
          trans, m, n, alpha,
          A, lda, X, incx,
          beta, Y, incy)
    return Y
end

I’m using this to implement the Arnoldi factorization using classical Gram-Schmidt (CGS) twice, which is both stable and, if you have four or more cores, faster than modified GS. I’ve been testing it for correctness against the qr! function in Julia and qr in Matlab. This is not a good way to compute a QR factorization, and one expects it to be slower than the QR in LAPACK. For small problems the CGS approach is very competitive with QR, but for large ones LAPACK can use BLAS3 calls, and the speed gets a serious boost. You see this in both Matlab and Julia.

What I did not expect was that the Julia versions would be so much faster. For a 20000x800 matrix, qr! in Julia was over 20x faster than qr in Matlab. My CGS version was over 8x faster in Julia (once I figured out how to reduce the allocation burden with views).

I stand impressed.
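For anyone who wants to repeat this kind of correctness check, here is a minimal sketch of one way to do it. The helper name `qr_errors` and the matrix size are assumptions for illustration, not taken from the code used for the timings above. It measures the loss of orthogonality in Q and the relative residual of the factorization, shown here on the output of Julia's own `qr`:

```julia
using LinearAlgebra

# Hypothetical helper (not from the post): two standard error measures
# for a thin QR factorization A ≈ Q*R.
function qr_errors(A, Q, R)
    orth = norm(Q' * Q - I)             # loss of orthogonality in Q
    resid = norm(Q * R - A) / norm(A)   # relative factorization error
    return (orth=orth, resid=resid)
end

A = rand(200, 20)        # hypothetical test matrix
F = qr(A)
Q = Matrix(F.Q)          # thin Q, 200×20
R = F.R
errs = qr_errors(A, Q, R)
println(errs)
```

The same two numbers can be computed for the Q and R returned by a Gram-Schmidt code, which makes a side-by-side comparison with the LAPACK-based factorization straightforward.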

Tim_Kelley · August 22, 2019, 8:56pm

Hi, I’m bringing this thread back to life to get some advice.

I’m trying to avoid the ccall to blasfunc and am getting similar
performance (but a higher allocation burden) using

So, here’s a QR code that does this in two ways, one with BLAS calls
and one without. When I do not use BLAS calls I’m getting killed by
allocations on the two lines that update the new column:

qk.-=Qkm*rk
qk.-=Qkm*pk

Is there something I’m doing wrong here? Is there an obvious way to
reduce the allocation burden?

'preciate it,

– Tim

using LinearAlgebra

function classical2!(A)
    (m, n) = size(A)
    precision = eltype(A)
    R = zeros(precision, n, n)
    R[1,1] = norm(A[:,1])
    A[:,1] = A[:,1] / R[1,1]
    #
    # Turn on the BLAS calls
    #
    doblas = 1
    #
    # Compute the factorization with CGS twice.
    #
    @views for k = 2:n
        rk = R[1:k-1,k]
        qk = A[:,k]
        Qkm = A[:,1:k-1]
        pk = zeros(size(rk))
        if doblas == 0
            #
            # no BLAS
            #
            # Orthogonalize
            rk .+= Qkm' * qk
            qk .-= Qkm * rk
            # Orthogonalize again
            pk .= Qkm' * qk
            qk .-= Qkm * pk
            rk .+= pk
        else
            #
            # BLAS
            #
            # Orthogonalize
            BLAS.gemv!('T', 1.0, Qkm, qk, 1.0, rk)
            BLAS.gemv!('N', -1.0, Qkm, rk, 1.0, qk)
            # Orthogonalize again
            BLAS.gemv!('T', 1.0, Qkm, qk, 0.0, pk)
            BLAS.gemv!('N', -1.0, Qkm, pk, 1.0, qk)
            rk .+= pk
        end
        R[k,k] = norm(qk)
        qk ./= R[k,k]
    end
    return (Q=A, R=R)
end
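On the allocation question: in an update such as `qk .-= Qkm*rk`, the product `Qkm*rk` is computed into a temporary vector before the broadcast subtracts it, so every such line allocates. One possible way around this without calling BLAS by name is the in-place `mul!` from LinearAlgebra, including its five-argument form (available in Julia 1.3 and later, as far as I know). The sketch below is an illustration under that assumption; the helper name `project_out!` and the test sizes are hypothetical:

```julia
using LinearAlgebra

# Hypothetical helper: one Gram-Schmidt pass on column k of A, in place.
# coeffs is overwritten with Qkm' * qk, then qk with qk - Qkm * coeffs.
function project_out!(A, coeffs, k)
    qk = view(A, :, k)
    Qkm = view(A, :, 1:k-1)
    mul!(coeffs, Qkm', qk)              # coeffs = Qkm' * qk, written in place
    mul!(qk, Qkm, coeffs, -1.0, 1.0)    # qk = -1.0 * Qkm * coeffs + 1.0 * qk
    return coeffs
end

# Hypothetical test data; reference values use the allocating formulas.
A = rand(50, 4)
k = 4
c_expected = A[:, 1:k-1]' * A[:, k]
q_expected = A[:, k] - A[:, 1:k-1] * c_expected

coeffs = zeros(k - 1)
project_out!(A, coeffs, k)
println(A[:, k] ≈ q_expected)
```

For Float64 views like these, `mul!` should end up in the same BLAS gemv routine as the explicit calls, so the expectation is similar speed with the temporaries gone; that is worth confirming with `@allocated` or a benchmark on the real problem sizes.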
