[ANN] FastLapackInterface.jl v1.0.0: Non-allocating LAPACK factorizations

The DynareJulia team and I are excited to announce the first major release of FastLapackInterface.jl. The package facilitates the pre-allocation of workspaces to be used with some of the most important LAPACK functions with respect to factorizations. This eliminates all major sources of allocations inside the LAPACK functions. The package for now targets QR, Schur and LU factorizations, but functionality may be added in the future.

We have designed the package to be as consistent with the LAPACK functionality in Base julia, making it highly transparent. The QRWs workspace for example can be used with LinearAlgebra.QR as follows:

julia> A = [1.2 2.3
            6.2 3.3]
2×2 Matrix{Float64}:
 1.2  2.3
 6.2  3.3

julia> ws = QRWs(A)
QRWs{Float64}
work: 64-element Vector{Float64}
τ: 2-element Vector{Float64}

julia> t = QR(LAPACK.geqrf!(A, ws)...)
QR{Float64, Matrix{Float64}, Vector{Float64}}
Q factor:
2×2 QRPackedQ{Float64, Matrix{Float64}, Vector{Float64}}:
 -0.190022  -0.98178
 -0.98178    0.190022
R factor:
2×2 Matrix{Float64}:
 -6.31506  -3.67692
  0.0      -1.63102

julia> Matrix(t)
2×2 Matrix{Float64}:
 1.2  2.3
 6.2  3.3

The package is fully tested and documented, and a suite of benchmarks to compare with Base julia LAPACK functions is included.

All the best,
Louis

33 Likes

Thanks a lot for this package! It would be very useful in my applications where I need LAPACK operations for many small matrices.

I have two questions:

  1. Do you have a plan to integrate more functions? Specifically, I use Hermitian diagonalization (syev / heev) a lot.

  2. If I understood the code correctly, currently the non-allocating function throws if the workspace is too small. Would it be possible to have an option to just resize the workspace and run?

2 Likes
  1. Yes! In my own work I also mostly run diagonalizations, and I have implemented in the past something similar to a Workspace for it. We are totally open to adding more functionality, and this would be the first one we’d start with.
  2. You are correct. It’s not a bad idea to indeed support resizing. If you want, could you open an issue on it to further discuss what would be the best way to do this? I’m not sure if doing that “under the hood” is the best idea since maybe that would lead to some unexpected behaviors for users, i.e. too much magic. Especially considering that for some of the factorizations work buffers are not trivially determinable but rather require a LAPACK call themselves (maybe looking at what LAPACK does to figure out the size and doing it manually in julia is possible though). I think just having a resize! function that would take in a new size or matrix would be a good idea.
1 Like

LinearAlgebra functions often involve two LAPACK calls, one to determine the workspace size and the second to actually perform the calculation after resizing the workspace. Having the resizing step built in might not be too surprising.

On another note, I guess pre-allocating a workspace will not be thread-safe?

The whole point of the package is to not have allocations occur during the LAPACK calls.

Why not? Using the same workspace with multiple threads of course won’t be indeed.

I see. My (erroneous) impression was that the package allows one to pre-allocate output matrices, and isn’t fastidious about allocations in resizing the workspace. Thanks for clearing this up!

Yes, I was referring to using the same workspace from multiple threads, which seems unsafe. I guess the solution is to allocate a separate workspace for each thread, and have each task fetch a workspace from a pool.

No problem!

Exactly!

Hi All,

We’re happy to announce an update to FastLapackInterface now including Eigen decompositions!

We added 3 new Workspaces:

  • EigenWs
  • HermitianEigenWs
  • GeneralizedEigenWs

We also came up with a more uniform API that allows to construct the correct Workspace depending on the target LAPACK function, for example:

ws = Workspace(LAPACK.getrf!, A; kwargs...)
factorize!(ws, A)

will automatically create a QRWYWs and call getrf! through factorize!.

Let us know what you think!

Cheers

6 Likes

Hi All,

We have another update to announce, FastLapackInterface version 1.2.0.

Cholesky and BunchKaufman related Workspaces were added, and now each Workspace can be resized using resize!. Automatic resizing functionality while calling the LAPACK functions or factorize!/decompose! is turned on by default, but can be controlled with the resize keyword argument.

i.e.

A = rand(3,3)
ws = Workspace(LAPACK.getrf!, A; kwargs...)
B = rand(4,4)
factorize!(ws, B)

will automatically resize ws appropriately. If instead factorize!(ws, B, resize=false) were used, an error would be thrown.

If you have any feedback let us know!

Cheers,
Louis

3 Likes