This is more a “request for comments” than an announcement. I’ve just created (but not registered) a package called VectorInterface.jl. Why? The following is a copy of the README. It is long, but I wanted to do a decent job explaining the context and the problem.
Context
Firstly, recall the basic properties of vectors in a given vector space. A vector space is a set of objects, called vectors, with the basic properties that:
- Vectors can be added (which is commutative, there is a neutral element and inverses)
- The neutral element is called the zero vector
- Vectors can be rescaled with a scalar coefficient, taking values in an underlying scalar field. (and there are all kind of relations with vector addition, such as distributivity, …)
Together, these two operations give rise to the construction of linear combinations. - Furthermore, often there is an inner product, or at least a norm.
- Finally, many useful vector spaces admit a finite basis, such that any vector can be written as
a linear combination of a finite number of basis vectors.
Many quantities in science and engineering (and thus in scientific computing) behave as vectors, typically with real or complex numbers as underlying field. Even quantities that don’t (but rather belong to a manifold), their derivatives (tangents and cotangents) do, which is important for automatic differentiation.
More importantly, many algorithms can be formulated using the basic operations (linear combinations and inner products), in combination with some recipe for spitting out new vectors (e.g. applying a linear map): gradient optimization, ODE integrators, Krylov methods, …
Current situation and problems in Julia
The most elementary Julia type that acts as a vector is of course Vector{T<:Number}
, or in fact (almost *) any subtype of AbstractArray{T<:Number}
. However, many other types which are not subtypes of AbstractArray
also are vectors conceptually, e.g. types for representing functions (ApproxFun ecoystem), Tangent
types in the AD ecosystem, and many more.
However, I have several gripes with the Julia interface (or lack thereof) to access the basic vector operations, which makes writing generic algorithms hard. In particular:
-
Vector addition and vector rescaling almost always works simply as
v + w
andv * α
. However, for efficiency, we often want to be able to do these operations in place. For instances ofAbstractArray
, we then need to beusing LinearAlgebra
(which also includes a lot of stuff that we might not need). For vector rescaling within theAbstractArray
family, we can usermul!(v, α)
orlmul!(α, v)
ormul!(w, v, α)
, but that interface is conflated with the concept of matrix multiplication etc, whereas these are two very different concepts. There used to be ascale!
method (in analogy with a corresponding method in BLAS level 1) in older Julia 0.x versions (though it was also not perfect, and was also used for matrix times diagonal matrix). For vector addition on the other hand, the options for in-place methods are the crypticaxpy!
andaxpby!
methods (referring to their BLAS level 1 analogues), with thus a very un-Julian interface (the vector that is modified is the final rather than the first argument). -
In programming, rather than the scalar field (reals or complex), we of course want to know the specific scalar type, with which the vectors can be natively rescaled. For an instance
v
ofAbstractArray{T<:Number}
, this is the typeT
and it can be obtained aseltype(v)
. However, becauseeltype
is also used by the iteration interface, other types which might have an iteration behaviour that is distinct from their vector-like behaviour, cannot overloadeltype
for both purposes. An example fromBase
would be a nested array, e.g. the typeVector{Vector{T<:Number}}
still constitutes a set of vectors with scalar typeT
, buteltype
equal toVector{T}
-
To get the zero vector associated with some vector
v
, we can usezero(v)
, which is fine as an interface, as it is defined to be the neutral element with respect to+
. However,zero
on e.g. a nested array will fail because of how it is implemented. Furthermore, to make a zero vector in place, you could usefill!(v, 0)
forv::AbstractArray{T<:Number}
, but that is a veryAbstractArray
specific interface. The only more general solution is to restort to scaling by zero, e.g. usingrmul!(v, 0)
, if available. -
Closely related to the previous two points, we often want to be able to create equivalent vectors but with a modified scalar type. For vectors in
AbstractArray{T<:Number}
, there issimilar(v, T′)
for this, but that is a very array-specific method, and again fails for e.g. nested arrays. -
Most (but not all) vector-like objects in programming belong to a finite-dimensional vector space. For
v::AbstractArray{T<:Number}
, this dimension is given bylength(v)
, but again this interface is also used for the iteration length, and so new types might face an incompatibility as witheltype
. And for structured arrays,length(v)
might also not express the vector space dimension, for e.g.UpperTriangular{T<:Number}
, the natural vector space dimension isn*(n+1)/2
, notn*n
. -
The inner product and norm corresponds to the Julia methods
LinearAlgebra.dot
andLinearAlgebra.norm
. Unlike in some of the previous points,dot
andnorm
natively support nested arrays. However,dot
is so loose in its implementation, that it also happily computes an inner product between things which are probably not vectors from the same vector space, such asdot( (1, 2, [3, 4]), [[1], (2,), (3,4)])
. In particular,dot
andnorm
alsoaccepts tuples, whereas tuples don’t behave as vectors with respect to the previous methods (+
,*
,zero
).
In summary, the main problem is that there actually is no formal standardized vector interface in Julia, despite its broad potential applicability for writing very generic algorithms.
Existing solutions
Different ecosystems have responded to this hiatus in different ways. Several Krylov and optimization packages merely restrict their applicability to instances of (Abstract)Array{T<:Number}
or even simply Vector{T<:Number}
(like their Fortran and C analogues would probably do). The DifferentialEquations.jl ecosystem does more or less the same, restricting to AbstractArray
(if I remember correctly), but provides a bunch of packages such as RecursiveArrayTools.jl
and ArrayInterface.jl
to accommodate for more complex use cases. Finally, the AD ecosystem (Zygote.jl and ChainRules.jl) use custom Tangent
types for which they define the necessary operations, using a lot of internal machinery to destructure custom types.
Forcing everything to be a subtype of AbstractArray
is an unsatisfactory solution. Some vector like objects might not be naturally represented with respect to a basis, and thus have no notion of indexing, and might not even be finite-dimensional. The AbstractArray
interface is and should be distinct from a general vector (space) interface.
New solution
With VectorInterface.jl, I have tried to create a simple package to resolve my gripes. As I hope that I am not alone with those, I would like this to be useful for the community and could eventually evolve into a standardized interface. Therefore, I would very much value comments. Everything is up for bikeshedding. I tried to come up with a design which is compatible with LinearAlgebra
(e.g. not stealing names) and does not commit type piracy. Currently, VectorInterface.jl provides the following minimalistic interface:
-
scalartype(v)
: accesses the native scalar typeT<:Number
of a vector-like object; also works in the type domain (i.e.scalartype(typeof(v))
). -
zerovector(v)
,zerovector!(v)
andzerovector!!(v)
: produce a zero vector of the same type asv
; the second method tries to do this in-place for mutable types, whereas the third method is inspired by BangBang.jl and tries to do it in-place when possible, and out of place otherwise.Ideally, the
zerovector
functionality would be provided byBase.zero
. -
zerovector(v, S<:Number)
creates a zero vector of similar type, but with a modified scalar type that is now given byS
. In fact, alsozerovector!(v, S)
andzerovector!!(v, S)
work, but for the former,S = scalartype(v)
is the only sensible choice.Given that there is a tendency to zero out uninitialized memory, I think it is fine to merge the concept of constructing a new vector with different scalar type with that of constructing the zero vector.
-
scale(v, α)
,scale!(v, α)
andscale!!(v, α)
rescale the vectorv
with the scalar coefficientα
. The second method tries to do this in place, but will fail ifα
cannot be converted toscalartype(v)
(or ifv
contains immutable contents), whereas the third method is the BangBang-style unified solution. There is alsoscale!(w, v, α)
andscale!!(w, v, α)
to rescalev
out of place, but storing the result inw
. -
add(w, [β = 1,] v [, α = 1])
,add!(w, [β,] v [, α])
andadd!!(w, [β,] v [, α])
computew * β + v * α
, where (by now self-explanatory) the second method stores the result inw
(and errors if not possible), and the third method tries to store inw
but doesn’t error. -
inner(v, w)
works almost equivalent todot(v, w)
, is sometimes a bit more efficient, but also more strict in what arguments it allows. -
norm(v)
simply reexportsLinearAlgebra.norm
These methods are implemented for instances v
of type <:Number
(scalars are also vectors over themselves) and <:AbstractArray
(both <:AbstractArray{<:Number}
and nested array).
In addition, the interface is currently also defined for tuples (but see below), again with arbitrary nesting. So instances of e.g. Vector{NTuple{3,Matrix{Float64}}}
are currently also supported.
Furthermore, in-place methods will work recursively so as to try to maximally work in place. What I mean by that is, if you have nested vectors, say v = [[1., 2.], [3., 4.]]
, then rmul!(v, 0.5)
will keep the outer array, but will work on the inner arrays using regular *
multiplication, and will thus allocate two new inner arrays in this example. scale!(v, 0.5)
would work in-place on the inner arrays and be free of allocations.
Similarly, for v = ((1., 2.), [3., 4.])
, scale!!(v, 0.5)
could of course not work in-place on the outer tuple or inner tuple, but would still work in-place on the inner array. Hence, the return value of scale!!(v, 0.5)
is of course ((0.5, 1.), [1.5, 2.])
, but after his operation, v
would be ((1., 2.), [1.5, 2.])
.
However, I have various questions about which I have not yet made up my mind:
- Should tuples actually be supported? In Base, they are not treated as vectors, do they are supported by
dot
andnorm
. - Should there be some
vectordim
function (name up to debate) to probe the vector space dimension? - Should I also export
dot
? Or shouldinner
just becomedot
and should I not care about its looseness? - Should I have fallbacks in place for user types that did already implement
rmul!
,mul!
,axpy!
,axpby!
anddot
(the latter relating to the previous question)?
TODO: Tests, more documentation
All thoughts and comments are very welcome.
(*) There is one (actually two) subtype of AbstractArray
in LinearAlgebra
that does not behave as a vector, in the sense that its instances cannot represent the zero vector or cannot be rescaled without changing its type, namely UnitUpperTriangular
and UnitLowerTriangular
. The fixed unit diagonal prevents these types from constituting a vector space. It seems like the unit diagonal also poses issues for broadcasting, as operations that are preserving ones are much more rare than operations preserving zeros (which is necessary for any structured or unstructured sparseness).