Location of libopenblas

I’m trying to make a direct call to dgemm. Julia can’t seem to find the library, which on my Mac is in .julia/conda/3/lib instead of /usr/local/lib. Did I make an error in the install?

When I look at the source for gemm for guidance, it seems the library should be called libopenblas64.

As you can see, I am very confused. If I have to reinstall from scratch, I will, but want to do it right.

Thanks,

If you’re making the call from within Julia, I would just do it the same way the standard library does:

Use:

julia> BLAS.@blasfunc dgemm_
:dgemm_64_

julia> BLAS.libblas
"libopenblas64_"

EDIT: This has the benefit of also working for folks who build with MKL instead.
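Putting those two pieces together in a ccall, a minimal sketch in the same spirit as the stdlib wrappers (klnrm2 is just an illustrative name, and this assumes a Julia version where BLAS.libblas is defined, as in the REPL output above):

using LinearAlgebra
using LinearAlgebra: BlasInt

# Wrap dnrm2 the way the standard library does: the routine name and the
# library appear as constants inside the ccall itself. (klnrm2 is a made-up name.)
function klnrm2(x::Vector{Float64})
    n = length(x)
    ccall((BLAS.@blasfunc(dnrm2_), BLAS.libblas), Float64,
          (Ref{BlasInt}, Ptr{Float64}, Ref{BlasInt}),
          n, x, 1)
end

klnrm2(x) should then agree with norm(x). The point of this pattern is that the (routine, library) pair is a constant expression inside the ccall, not something stashed in runtime variables.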

Thanks. Should the call look like this for square matrices A, B, and C?

libblas = BLAS.libblas
dgemm = BLAS.@blasfunc dgemm_
ccall((dgemm, BLAS.libblas), Cvoid,
      (Ref{UInt8}, Ref{UInt8}, Ref{BlasInt}, Ref{BlasInt},
       Ref{BlasInt}, Ref{Float64}, Ptr{Float64}, Ref{BlasInt},
       Ptr{Float64}, Ref{BlasInt}, Ref{Float64}, Ptr{Float64},
       Ref{BlasInt}),
      'N', 'N', rowsofA, colsofB,
      seconddimofA, alpha, A, leadingdimofA,
      B, leadingdimofB, beta, C, leadingdimofC)

When I run it from the REPL things seem to break with
ERROR: TypeError: in ccall: first argument not a pointer or valid constant expression, expected Ptr, got Tuple{Symbol,String}

If I embed it in a function I get a segmentation fault. I know I’m missing something fundamental, but can’t figure out what it is.

I finally understood your advice and figured out how to do it. This works:
using LinearAlgebra

# In-place dgemv: Y .= alpha*op(A)*X .+ beta*Y, where op(A) = A or A' is selected by trans ('N' or 'T').
function klgemv!(trans, m, n, alpha, A, lda, X, incx, beta, Y, incy)
    ccall(((BLAS.@blasfunc dgemv_), Base.libblas_name), Cvoid,
          (Ref{UInt8}, Ref{Int64}, Ref{Int64}, Ref{Float64},
           Ptr{Float64}, Ref{Int64}, Ptr{Float64}, Ref{Int64},
           Ref{Float64}, Ptr{Float64}, Ref{Int64}),
          trans, m, n, alpha,
          A, lda, X, incx,
          beta, Y, incy)
    return Y
end
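For a quick check, something like the following (sizes here are arbitrary) should reproduce the built-in matrix-vector product:

using LinearAlgebra

m, n = 200, 100
A = randn(m, n); x = randn(n); y = zeros(m)

# y .= 1.0*A*x .+ 0.0*y through the wrapper; lda is the leading dimension of A, here m.
klgemv!('N', m, n, 1.0, A, m, x, 1, 0.0, y, 1)

norm(y - A*x)   # should be at rounding-error level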

I’m using this to implement the Arnoldi factorization using classical Gram-Schmidt twice, which is both stable and, if you have four or more cores, faster than modified GS. I’ve been testing it for correctness against the qr! function in Julia and qr in Matlab. This is not a good way to compute a QR factorization, and one expects it to be slower than the QR in LAPACK. For small problems the CGS approach is very competitive with QR, but for large ones LAPACK can use BLAS-3 calls and the speed gets a serious boost. You see this in both Matlab and Julia. What I did not expect was that the Julia versions were so much faster. For a 20000x800 matrix, qr! in Julia was over 20x faster than qr in Matlab. My CGS version was over 8x faster in Julia (once I figured out how to reduce the allocation burden with views).

I stand impressed.

Hi, I’m bringing this thread back to life to get some advice.

I’m trying to avoid the ccall to blasfunc and am getting similar
performance (but a higher allocation burden) using ordinary Julia matrix-vector products instead.

So, here’s a QR code that does this in two ways, one with BLAS calls
and one without. When I do not use the BLAS calls I’m getting killed with
allocations on the two lines that update the new column:

qk .-= Qkm*rk
qk .-= Qkm*pk

Is there something I’m doing wrong here? Is there an obvious way to
reduce the allocation burden?

'preciate it,

– Tim


using LinearAlgebra

function classical2!(A)
    (m, n) = size(A)
    precision = eltype(A)
    R = zeros(precision, n, n)
    R[1, 1] = norm(A[:, 1])
    A[:, 1] = A[:, 1] / R[1, 1]
    #
    # Turn on the BLAS calls
    #
    doblas = 1
    #
    # Compute the factorization with CGS twice.
    #
    @views for k = 2:n
        rk = R[1:k-1, k]
        qk = A[:, k]
        Qkm = A[:, 1:k-1]
        pk = zeros(size(rk))
        if doblas == 0
            #
            # no BLAS
            #
            # Orthogonalize
            rk .+= Qkm'*qk
            qk .-= Qkm*rk
            # Orthogonalize again
            pk .= Qkm'*qk
            qk .-= Qkm*pk
            rk .+= pk
        else
            #
            # BLAS
            #
            # Orthogonalize
            BLAS.gemv!('T', 1.0, Qkm, qk, 1.0, rk)
            BLAS.gemv!('N', -1.0, Qkm, rk, 1.0, qk)
            # Orthogonalize again
            BLAS.gemv!('T', 1.0, Qkm, qk, 0.0, pk)
            BLAS.gemv!('N', -1.0, Qkm, pk, 1.0, qk)
            rk .+= pk
        end
        R[k, k] = norm(qk)
        qk ./= R[k, k]
    end
    return (Q = A, R = R)
end
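On the allocation question: one option (a sketch, assuming Julia 1.3 or later so the five-argument mul! is available) is to write the no-BLAS branch with in-place mul!, which avoids building the temporaries Qkm*rk and Qkm*pk:

using LinearAlgebra

# Drop-in replacement for the no-BLAS branch above.
# mul!(C, A, B, alpha, beta) computes C .= alpha*A*B .+ beta*C in place.
# Orthogonalize
mul!(rk, Qkm', qk, 1.0, 1.0)     # rk .+= Qkm'*qk
mul!(qk, Qkm, rk, -1.0, 1.0)     # qk .-= Qkm*rk
# Orthogonalize again
mul!(pk, Qkm', qk)               # pk .= Qkm'*qk
mul!(qk, Qkm, pk, -1.0, 1.0)     # qk .-= Qkm*pk
rk .+= pk

For Float64 arrays these calls should lower to the same gemv! operations as the BLAS branch, so the performance ought to match while the code stays free of explicit BLAS syntax.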