I have encountered two distinct major performance hits caused by bugs in different versions of 0.6.0 (both reported in earlier threads here and as Julia issues), which in turn were caused by the fact that SparseMatrixCSC{T,U} <: AbstractMatrix{T} (equivalently, SparseMatrixCSC{T,U} <: AbstractArray{T,2}). Because of this subtyping relationship, a sparse matrix operation that is not properly dispatched (as happened in both of my cases) falls back to a dense-matrix implementation, which may be extremely slow.
I would like to propose that these subtype relationships be dropped from the language. Users of sparse matrices are already likely to be concerned about performance and will not want to contend with hidden performance hits caused by something silently "working" via a slow fallback. In other words, for sparse matrix users, the loss of convenience from deleting these subtype relationships is far outweighed by the assurance of no silent performance killers. Furthermore, there is a simple workaround if these relations are dropped: convert the sparse matrix to a dense matrix in an explicit preliminary step.
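To make the failure mode concrete, here is a minimal sketch (sizes are hypothetical; SparseArrays is the modern stdlib location, as these types lived in Base in 0.6):

```julia
using SparseArrays  # stdlib in Julia >= 0.7; in 0.6 sparse arrays lived in Base

S = sprand(3_000, 3_000, 1e-3)  # roughly 9_000 stored entries out of 9 million

# The subtype relationships in question:
@assert SparseMatrixCSC <: AbstractMatrix
@assert AbstractMatrix <: AbstractArray

# When a specialized sparse method is missing, dispatch silently falls
# through to a generic AbstractArray method that visits all 9 million
# entries instead of only the stored ones. The result is correct; the
# cost is not.

# The explicit workaround proposed above: deliberate, visible densification.
D = Matrix(S)  # `full(S)` in 0.6
```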
I think the underlying issue here is that there are currently not enough specialized methods for sparse matrices (see https://github.com/JuliaLang/julia/issues/21796; my impression is that views of sparse matrices are especially tricky). A better solution would be to implement those. I believe most users of sparse matrices would like them to behave like matrices, and to be efficient. Until those methods are in place, performance issues from fallback methods should be easy to catch with profiling. But my sense is that a lot of people have efficient sparse matrix implementations lying around for their own use (e.g. https://github.com/EcoJulia/SpatialEcology.jl/blob/master/src/Sparse_matrixfunctions.jl)
To help the engineering effort, I think some tooling work here could be valuable. As a first approximation, a simple comparison between methodswith(AbstractArray) and methodswith(SparseMatrixCSC) may be helpful; a sketch follows.
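Something like this (function names only; it over- and under-counts, but gives a first-order candidate list):

```julia
using InteractiveUtils, SparseArrays  # methodswith is in InteractiveUtils on Julia >= 0.7

generic     = methodswith(AbstractArray)    # methods written against the abstract type
specialized = methodswith(SparseMatrixCSC)  # methods with sparse-specific signatures

# Function names that have a generic definition but no sparse specialization;
# each is a candidate for (or at least a question about) a dedicated method.
candidates = setdiff(unique(m.name for m in generic),
                     unique(m.name for m in specialized))
```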
I’m sympathetic to the pain point here, but were we to change this, I think we’d instead get complaints about all the MethodErrors. I think there are better solutions worth exploring first, like https://github.com/timholy/ArrayIteration.jl, for example.
I disagree that it is easy to catch these issues with the profiler. In my own code, there is a minute of set-up time before the main loop begins, and this setup time alone makes use of the profiler difficult. Furthermore, my main loop sometimes executes method A and sometimes method B, which makes profiling either A or B individually a challenge. For the bug regarding sparse matrix concatenation, A and B both perform sparse-matrix concatenation, but only B was falling back to the dense matrix method. If you still don’t believe me, you are welcome to help me find other performance issues in my code using the profiler!
However, if there is another run-time tool that can specifically spot cases of sparse matrix computations being carried out via dense matrix fallbacks, I would like to know about it.
I also disagree with the assessment “I believe most users of sparse matrices would like to have them behave like matrices”. Deleting the subtype relationship does not make sparse matrices behave differently from matrices; it only makes certain fallback methods inaccessible.
Your suggestion and mine are not mutually exclusive: I agree with you that more sparse matrix methods should be implemented, and I also believe that the subtype relationships should be deleted.
I don’t follow Matt Bauman’s suggestion about “better solutions” such as ArrayIteration.jl. I looked at this package; its purpose seems to be to simplify the task of writing code that works generically with a variety of matrix-like types. But this is not what I am trying to do: I am trying to use sparse matrices to achieve high performance in an application. Genericity is not my issue.
Yes, I know. I’m just going one step back: in order for you to have access to those high-performance implementations, they must first exist.
Making it easier to create generic algorithms that are fast for sparse matrices makes it more likely that those efficient implementations will exist.
It shouldn’t be too hard to create a wrapper type that implements only the methods specifically designed for sparse matrices (i.e. those returned by methodswith) and throws an error for the others (including the AbstractArray fallbacks). That should let you catch any missing method; see the sketch below.
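Something along these lines, perhaps (StrictSparse is a made-up name, and only a couple of forwarded methods are shown):

```julia
using SparseArrays

# A wrapper that is deliberately NOT an AbstractArray, so none of the
# generic AbstractArray fallbacks can ever apply to it.
struct StrictSparse{Tv,Ti}
    data::SparseMatrixCSC{Tv,Ti}
end

# Forward only operations known to have sparse-specialized implementations:
Base.:*(A::StrictSparse, B::StrictSparse) = StrictSparse(A.data * B.data)
Base.getindex(A::StrictSparse, i::Integer, j::Integer) = A.data[i, j]
Base.size(A::StrictSparse, args...) = size(A.data, args...)

# Everything not forwarded raises a MethodError up front instead of
# silently running a dense fallback.
```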
FWIW, I am against this idea, though I appreciate the issue. The fact is that in my package we want to be able to index into a matrix of weights without having to worry whether it’s a sparse or dense representation. AbstractArray works well for both.
If AbstractArray is not extended to sparse matrices, what would its purpose be?
My two cents’ worth: users call upon sparse matrices for very specific reasons. If they want a dense matrix, they convert the sparse matrix to a dense matrix EXPLICITLY. In my opinion, IMPLICIT conversion of a sparse matrix to a dense matrix should never be done. It is far preferable to have the computation fail for lack of a suitable method than to silently convert to a dense matrix.
Would the following less radical proposal accomplish the desired goal? A new abstract type, say AbstractArrayWithLoops, is defined as an intermediate type between AbstractArray and DenseArray. All the functions in abstractarray.jl and abstractarraymath.jl that contain loops (copy, vcat, etc.) refer to this new type rather than AbstractArray. Then DenseArray, Array, etc. are subtypes of AbstractArrayWithLoops, whereas SparseMatrixCSC, etc. are subtypes of AbstractArray only.
This change would not break existing code, yet it would prevent sparse matrix operations from accidentally falling back to dense matrix methods should more bugs appear in the Julia Base files. A schematic of the hierarchy is sketched below.
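Since Base’s hierarchy cannot be rearranged from user code, here is the shape of the proposal with stand-in (My*) types:

```julia
# Stand-ins for the proposed hierarchy; illustrative only.
abstract type MyAbstractArray{T,N} end                                   # no loopy fallbacks here
abstract type MyAbstractArrayWithLoops{T,N} <: MyAbstractArray{T,N} end  # home of copy, vcat, ...

struct MyDense{T,N}       <: MyAbstractArrayWithLoops{T,N} end  # dense: inherits the loops
struct MySparseCSC{Tv,Ti} <: MyAbstractArray{Tv,2} end          # sparse: MethodError instead

# A loop-based generic would be defined at the intermediate level:
my_vcat(xs::MyAbstractArrayWithLoops...) = nothing  # stand-in for the loopy implementation
# my_vcat(MySparseCSC{Float64,Int}()) is a MethodError, not a slow success.
```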
But… why? What’s the benefit? A sparse matrix representation is still a matrix representation. That it doesn’t have the same methods that dense matrices have is a separate issue that can be addressed in some cases (and maybe can’t be in others). As a close analog: in LightGraphs, a SimpleWeightedGraph has a common grandparent type (AbstractGraph) with SimpleGraph, but it’s got a fundamentally different backend structure, and certain operations are suboptimal relative to SimpleGraph (like adding and removing vertices). I don’t think that means that we should have these two graphs have different parent types. It’s still convenient to write generic methods that take AbstractGraph and specialize where it makes sense to do so.
I’m probably missing something here, but separating sparse matrices from AbstractArray strikes me as a particularly bad idea.
Maybe you missed the first post, but the argument is, in short, that inheritance should be used when the inheriting types can share methods defined on the supertype. In the case of sparse matrices, almost any fallback to the supertype (AbstractArray) will be too inefficient to be usable in practice. Instead of getting a MethodError, you get the correct result, but at the cost of the wrong time complexity. It could be argued that getting a MethodError is more beneficial, because it surfaces the problem up front and you can add a specialized method to fix it, instead of your code running slowly without you noticing.
I am just stating the argument here, not saying whether I personally think it is worth doing.
I saw the first post and figured I misunderstood it, because I (strongly) hold the opposite opinion: it’s better to have code that is inefficient (but can be specialized) than code that doesn’t work at all. MethodError helps nobody in this particular case[1]; and there are some things (like indexing) that are fast enough in both cases to make the fallbacks worthwhile.
The other thing one could do is redefine all the methods that currently exist for AbstractArray but are not performant for sparse matrices so that they apply to DenseArray only (a sketch follows after the footnote). I dislike this approach, but at least it preserves AbstractArray.
[1] except those folks who will implement the functions themselves, which will lead to local implementations, some of them buggy, until someone writes a PR for Base, in which case we’re no worse off than we are today.
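For what it’s worth, the narrowing would look something like this (rowsums is a toy stand-in, not a Base method):

```julia
# Toy illustration of narrowing a loop-based generic from AbstractArray to
# DenseArray, so that sparse inputs get a MethodError instead of silently
# iterating every entry.
function rowsums(A::DenseArray{T,2}) where {T}
    out = zeros(T, size(A, 1))
    for j in axes(A, 2), i in axes(A, 1)
        out[i] += A[i, j]
    end
    return out
end

# rowsums(rand(5, 5))          # works: Matrix <: DenseArray
# rowsums(sprand(5, 5, 0.2))   # MethodError until a sparse method is written
```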
If we follow this argument to its end, then perhaps no array types should share the supertype AbstractArray, since for any given array type we can construct a method that is non-performant for that particular type but good for all the others.
But I do agree that we should have tooling which makes catching this kind of performance trap easy.
You could save the data after the setup, then profile using that; a sketch of this workflow is below. Profiling is of course not automatic and may require some work (refactoring, writing out test data, etc.), but it is often crucial for time-critical code. In general it is unrealistic to expect to skip profiling (and the related work) and rely on the language to save you from suboptimal code.
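Here setup() and mainloop(state) are stand-ins for the code described above:

```julia
using Serialization, Profile

# One-time: run the expensive setup and snapshot the result to disk.
state = setup()                      # the minute-long setup, stand-in name
serialize("setup_state.jls", state)

# Later sessions: skip the setup entirely and profile only the hot loop.
state = deserialize("setup_state.jls")
mainloop(state)                      # run once to compile before profiling
@profile mainloop(state)
Profile.print(mincount=10)           # hide frames with fewer than 10 samples
```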
I am not sure I understand correctly, but are you suggesting that the way to participate in this discussion is to do unpaid work for you?