Unified Interface for Linear Solving


#1

Building off of the rootfinding topic, I would like to propose some common interface for linear solving in Julia for the same reasons: allowing packages which use linear solvers to easy “swap out methods”. While \ is nice for many, it’s well known there exists many different methods we would like to be able to swap around, many existing in different packages. These include:

  1. Direct solvers (\)
  2. Sparse direct solvers (is there a package in Julia?)
  3. IterativeSolvers.jl
  4. PETSc.jl
  5. MUMPS.jl
  6. Sparso.jl
  7. CUSOLVER.jl
  8. ArrayFire.jl
  9. PyAMG.jl
  10. Pardiso.jl

(Any more to add?)

It would be nice to have one unified way to do:

linear_solve(A,b,alg(),kwargs...)

For example, one possibility could be:

linear_solve(A,b,cg!(alg_specific_option=...),abstol=1e-6)

to use the cg! method from IterativeSolvers.jl. Again, this would be really nice because then algorithms could be written at a high level without reference to a specific linear solving package, and users could then pick what fits the problem (including GPU algorithms like CUBLAS!).

Some problems to hammer out include:

  1. What about matrix-free cases? Just let A be a function?
  2. Etc.

#2

We do have one unified way: , or A_ldiv_B! if you want to pre-allocate the output.

You swap out solvers just by changing the matrix type. e.g. sparse-direct solvers are accessed by using a sparse matrix. You can also use various factorization objects, e.g. qrfact(A), if you want a particular factorization algorithm.

You get PETSc.jl solvers if you use a PETSc matrix (which can be wrapped around a Julia matrix in the serial case). You can specify more PETSc solver options by wrapping a PETSc solver object around your (PETSc) matrix, e.g.

K = KSP(A, ksp_type="gmres", ksp_rtol=1e-6)
x = K \ b # solve Ax=b

#3

But is there a good way to utilize this in a way where a user can pass in what method they want to do the linear solving with? If I am understanding the correctly, you mean that what the user should pass in is the function linear_solve, which you use like:

K = linear_solve(A) # This function is passed in
x  = K\b

and the user interface would be like:

f(x,linear_solve = (A) -> KSP(A, ksp_type="gmres", ksp_rtol=1e-6))

to pass in the solving method? It seems that works.


#4

Yes, you would mysolver(A) with backslash. Why wouldn’t you just call f(x, mysolver(A)) for a function f that needed the solver, like eigs?


#5

I am thinking about the common case where you’re asking for the method for doing the lienar solving which will be repeated open during some calculation. For example, what linear solver to use when solving a numerical ODE with a Rosenbrock method. That would need a function since it would need to know how to solve the linear system resulting from the Jacobian each iteration, so you cannot just pass a matrix.

Is there any way to do this with IterativeSolvers.jl? I.e. pass a function such that linear_solve(A) will then be \ and apply say GMRES?


#6

No, I don’t think iterativesolvers.jl uses , though I think it maybe should. e.g. gmres(A, ...options...) without a rhs argument could return a solver object that works with backslash.


#7

I opened an issue to discuss this with IterativeSolvers:

Also, I put out a blog post explaining the benefits this kind of interface would yield:


#8

Another package I hope to unify here would be Intel’s Sparso. It would be nice to have very fast sprase solvers included in the mix of tools available through the same API. I opened an issue for this.


#9

At least in CUSPARSE.jl, I’ve extended ldiv (since they do have a linear solve method). It’s a bit weird there because of the CUSPARSE API. I’d like to do the same for CUSOLVER (an apropos name) but haven’t had the time, if anyone would like to open some PRs they’re most welcome to do so.


#10

Ahh yes, how did I forget the GPU libraries? Added them to the list.

After playing with it a bunch, I think the:

  1. Make a type
  2. Dispatch \

way of doing this is very clean and very smart. Direct methods like lufact get to do the factorization at the type-making stage, and thus users/developers who use this get the “free” benefit that the factorization is only done once. the GPU libraries could also have a dispatch to send the matrix to the GPU here (if it’s a matrix on the CPU), making that data transfer only happen once even if there are multiple \.

So it hides extra optimizations in there. Then the resulting usage is just \ which is probably the most concise syntax you can have. This is really clever!

Getting the iterative solver packages in line with this setup is probably the only thing that needs to be done then.


#11

In ApproxFun, I overloaded \ for operators but allow passing in parameters such as tolerances:

\(A,b; tolerance=eps())

This pattern would also make sense for iterative solvers:

A = gmresfact(A,b; restarted=true)
\(A, b; tolerance=eps())

#12

If the type held the information for the solving args, then straight usage of \ would be possible:

A = gmresfact(A,b; restarted=true,tolerance=eps())
A \ b

This way the type construction and setup can be package-based, but \ is still universal.

This is the PETSc.jl solution and I think it makes it more extendable. Otherwise you’d have to hope the tolerance arg is the same for every package (or exists for every package) if you use `

\(A, b; tolerance=eps())

in your code.


#13

The motivation for passing it as a parameter is that the tolerance is a property of the solution, not the factorization. The same GMRES factorization (or in my case, adaptive QR factorization) can be reused with different tolerances, adaptively growing the Krylov subspace as necessary.


#14

Actually, I think your approach is better: in most cases where you reuse a factorization, you would want to use the same tolerance.

The tolerance of the factorization could be thought of as a default tolerance, which could be overriden:

G = gmresfact(A,b; tolerance=eps())
G \ b # uses default tolerance
\(G,b; tolerance=1E-3)  # overrides default tolerance

but this may be overkill.


#15

Couldn’t you just mutate it? A.tolerance = 2*eps().


#16

Yes…but then it’s not functional programming :disappointed:


#17

Why the focus on "\"? Is the backslash operator purely for Matlab compatibility or does it go back further than that? Taking inspiration from the Thyra package in Trilinos, a generic interface could look like this:

solve!(x, A, b, parameters)

x and b are (multi)-vector types, A is a matrix or factorization and parameters could implement some abstract base type for parameters of any kind. Following Julia convention the arguments are rearranged because the modified argument must be first, if I understand correctly. We could also have a variant that returns x rather than modifying the passed in value, but for multiple calls in series it could be beneficial to overwrite the existing solution.


#18

Looks similar to what I chose in https://github.com/JuliaSparse/Pardiso.jl#solving


#19

I’ve posted my perspective on the design issue in this Github comment.

The fundamental design problem is that there are three things that need to be specified - A, b and algorithm, and that if you want to write something \ something the only reasonable grouping is (A, algorithm) \ b. The question is balancing the convenience of the infix \ syntax with the need to specify three things.


#20

While I am clearly partial to solve style syntax (just look at JuliaDiffEq), it doesn’t hit the necessary points already mentioned in this thread. To summarize, a good API for this would need to:

  1. Be able to easily swap between different packages (and Base).
  2. Be able to “initialize”. This is important for dense solvers since being able to save the factorization can lead to large speedups, and for things like GPU code being able to do the data transfer once is essential.
  3. Have an easy in-place syntax.
  4. Be able to pass parameters.
  5. Not add dependencies.

The problem with solve's syntax is that it does not do 2 well.


Example

To better situate the problem and show the advantages/disadvantages of various setups, let’s have an example problem. This example stems from Rosenbrock ODE solvers, but it’s a pretty common pattern throughout applied math. I would like to be able to write a function that solves a linear system three times, changing the right-hand side each time, and allowing the user to choose the method. Using the “God type” method of @stevengj / PETSc.jl (“God type” naming from @jiahao 's issue. I am not sure of a good name for it because it’s not always a factorization, but I will use the word factorization) , I can do this as follows:

function OneStep(f,...,linear_solver=lufact) 
  ...
  A = Jacobian(f(...)) # A comes from some Jacobian
  K = linear_solver(A) # Build the factorization object
  ... # Some calculations
  @into! x1 = K\b1 # First linear solve
  ... # use x1 a bit
  @into! x2 = K\b2 # Second linear solve
  ... # use x2 a bit
  @into! x3 = K\b3 # third linear solve
  ... # Do some stuff to end
end

First off, let me mention that @into! is from InplaceOps.jl and just changes the linear solves to be an in-place operation via A_ldiv_B!, so it’s just clean syntax. But let’s actually see what that does. If one uses a dense matrix solver like lufact or qrfact, then this algorithm will solve the factorization once, and then re-use it in all 3 cases. This is then an efficient code for dense solvers. But in many cases, a Jacobian may be sprase. If someone passes in

OneStep(f,...,(A) -> KSP(A, ksp_type="gmres", ksp_rtol=1e-6))

then all of the linear solving places are replaced by PETSc.jl’s GMRES, with no modification to the code necessary.

Now let’s see how other packages could really build off of this. Take @kslimes 's CUSOLVER.jl. If this package provided an interface which had such types and dispatches, then the following could occur. The type generation could be some function like

linear_solve = (A) -> CudaSolve(A,csrlsvlu(),abstol=1e-6)

and which could send the (hopefully) sparse matrix to the GPU and save it as a sparse matrix in a CUDA format. Then \ (and A_ldiv_B!) can be overloaded to

  1. Send the vector to the GPU (if the vector is not a GPU vector. This “if” is clearly dispatch)
  2. Compute \ using the chosen algorithm.
  3. If it started on the CPU, send the resulting vector back to the CPU.

Now, when a user does

OneStep(f,...,(A) -> CudaSolve(A,csrlsvlu(),abstol=1e-6))

then, without any modifications to OneStep, the code will do the linear solves on the GPU and only send the Jacobian to the GPU a single time.

Then if Pardiso.jl had an object setup which then just calls solve, or MUMPS.jl, or …, then all of these codes can easily be exchanged. Of course, you’d have to check the different packages for how to specify the tolerances each time (since they could name them literally anything), but in the end you can use anything. The package which has the OneSteps function, in this setup, does have to depend on any of the linear solvers and so the user can give it any function which produces a type which can \.

The problem with solve is I don’t see how it can match this two-step approach in an easy manner and get the same efficiency. It can get lots of these things right, but the initialization is really crucial for many linear solving setups (not for iterative solvers, but just thinking about CPU-based iterative solvers is really close-minded here). And I don’t see why you’d want to avoid : MATLAB had some bad syntax choices, but \ is one choice that I find to be very clear and elegant.