File organization in Base, and modularity


#1

I have been playing with the code in LinAlg with the aim of replacing Ac_mul_B and so on with calls to * using RowVector, TransposedMatrix and ConjArray. Since this affects almost all of the matrix multiplication code, I have started moving some functions around into a pattern which makes more sense to me. (You can preview this early WIP here, particularly in linalg/matmul.jl and linalg/blas.jl.)

Before I get carried away with reorganizing things, I wanted to ask what direction we would like to move in (or whether the current layout is preferred). I have read in multiple places that we would like to move to a modular system of standard libraries, and it would be convenient to be able to build Julia without BLAS and still have functioning (albeit slower) matrix multiplication.

The current situation is that we have a module called Base.LinAlg.BLAS which defines ccall wrappers for the Fortran library. However, the specialized methods that dispatch to a given BLAS matrix multiplication routine are defined in Base.LinAlg, in linalg/matmul.jl, for BLAS element types, while the generic matrix multiplication routines live in linalg/generic.jl.
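To make the two layers concrete, here is a minimal sketch of the difference between calling a low-level wrapper directly and going through the high-level * method. It assumes a current Julia 1.x release, where these wrappers live in the LinearAlgebra standard library as LinearAlgebra.BLAS rather than in Base.LinAlg.BLAS as discussed in this thread; the matrices are made-up illustration data.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]

# Low-level layer: a thin wrapper around the BLAS gemm routine.
# It computes C = alpha*A*B + beta*C in place ('N' means "do not transpose").
C = zeros(2, 2)
LinearAlgebra.BLAS.gemm!('N', 'N', 1.0, A, B, 0.0, C)

# High-level layer: the ordinary * method, which picks an implementation
# based on the argument types.
D = A * B

println(C == D)
```

The wrapper only accepts the element types that BLAS supports, while * accepts any matrix whose elements can be multiplied and added, which is why the thread distinguishes the two layers.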

From the perspective of someone who has mostly developed packages outside of Base, this pattern seems rather odd to me. Typically, a module or package extends another module. For instance, I would have LinAlg define generic methods for * and A_mul_B! for matrices, vectors, row vectors, etc. Then the BLAS module would link to the Fortran code and define specializations of the methods for * and A_mul_B! already defined in LinAlg, dispatching on StridedVecOrMat{<:BlasFloat}. This way the module could be loaded optionally, or after LinAlg. (I am aware of the complications of e.g. LAPACK depending on BLAS, and so on, so this particular case is a bit delicate…)
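The layering described above could look roughly like the following sketch. The module names GenericLinAlg and FastBackend, the function matmul and the alias FastFloat are hypothetical stand-ins for LinAlg, the BLAS module, * and BlasFloat; the "fast" method simply defers to the built-in * instead of calling Fortran, and the code assumes Julia 1.x syntax.

```julia
# Hypothetical stand-in for LinAlg: owns the function and its generic fallback.
module GenericLinAlg

# Works for any element type that supports * and +.
function matmul(A::AbstractMatrix, B::AbstractMatrix)
    size(A, 2) == size(B, 1) ||
        throw(DimensionMismatch("inner dimensions must match"))
    # Simplification: a real implementation would infer the result type more carefully.
    T = promote_type(eltype(A), eltype(B))
    C = zeros(T, size(A, 1), size(B, 2))
    for j in 1:size(B, 2), k in 1:size(A, 2), i in 1:size(A, 1)
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end

end # module GenericLinAlg

# Hypothetical stand-in for the BLAS module: loaded later, and only adds
# more specific methods to the function that GenericLinAlg already owns.
module FastBackend

using LinearAlgebra
import ..GenericLinAlg: matmul

# Illustrative subset of the element types a vendor library would handle.
const FastFloat = Union{Float32, Float64}

# Assumption: the built-in * stands in for a call into the external library.
matmul(A::Matrix{T}, B::Matrix{T}) where {T<:FastFloat} = A * B

end # module FastBackend

F = [1.0 2.0; 3.0 4.0]
R = [1//2 1//3; 1//4 1//5]

# Float64 matrices hit the specialized method; Rational matrices fall back.
println(which(GenericLinAlg.matmul, (typeof(F), typeof(F))).module)
println(which(GenericLinAlg.matmul, (typeof(R), typeof(R))).module)
```

If FastBackend were never defined, both calls would use the generic method and still return correct results, which is the property that would let the optimized module be optional.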

However, cramming more things into linalg/blas.jl seemed undesirable, since you end up with a monolithic file containing ccall wrappers, Julia code and so on. If it were a package, such a BLAS.jl package would have its own src/ directory to organize files in. I was wondering if more directories representing modules would be a welcome change? Would such a directory be base/linalg/blas or (similar to sparse) would it be base/blas? Would BLAS and LAPACK share a directory, since LAPACK requires BLAS anyway?

More generally, do people want more modularity in LinAlg? It seems rather entangled at the moment, so it would take quite some effort to make it fully modular. But in general, are baby steps welcome, and how should we go about this?

cc @andreasnoack


#2

I agree with your proposed reorg. BLAS.jl and LinAlg.jl should be separate packages. BLAS.jl should be internally divided into wrappers for the lower-level vendor blas, specializations of the appropriate julia-level functions, and generic/fallback implementations (which could hopefully eventually become fast enough that whether to include a vendor blas is just a configuration option).

It seems to me that all of linalg/matmul.jl should be part of BLAS.jl, as well as the BLAS-like parts of linalg/generic.jl.


#3

As I understand the proposal, the BLAS module should be separate from the generic linear algebra code in Base and provide optimized versions of the existing functions for BlasFloat element types. If a BLAS module includes all matrix multiplication code, then it would have to be loaded as one of the first modules in LinAlg. If it only provides fast versions of the existing functions, then it could be a last and optional module in LinAlg.


#4

BLAS.jl should be internally divided into … and generic/fallback implementations (which could hopefully eventually become fast enough that whether to include a vendor blas is just a configuration option).

It seems to me that all of linalg/matmul.jl should be part of BLAS.jl, as well as the BLAS-like parts of linalg/generic.jl.

Like Andreas said, I was suggesting that “BLAS” be a set of wrappers for an external library that makes matrix multiplication faster. Not loading the BLAS submodule shouldn’t mean generic matrix multiplication is impossible - I don’t see the upside of that.

(For example, it would be interesting to build Julia with fewer dependencies and still have generic code work, e.g. this. I do realize we don’t (yet) have generic replacements for everything in LAPACK, ARPACK and SuiteSparse, but that’s not a reason not to have a modular system like those common in large systems of packages (MathProgBase is one example, but I’m not that familiar with it).)