I am trying to implement Cuthill McKee on a number of large, sparse matrices. In particular, I would like to get the mapping from original row order to post-CM (or post-reverseCM) row order.
The only package that I can find that implements Cuthill McKee does this using nodes. In particular, the example linked above wants a Dict for the sparsity pattern (and another dict representing the node-types).
First, is there a cleaner way that anyone knows to do this? And, if not, is there a clean way to output the sparsity pattern as a dict?
As of now, I suspect I will just be writing a lot of for-loops, but I am hoping that there is a more elegant solution. Thank you!
I remember it worked. I don’t know if it is efficient. Of course it was taylored for a specific data structure. I will try to convert it to use CSC sparse matrix this week end.
I do have some nagging doubts about the source algorithm (Zavoianu).
I looked at the Zavoianu version some time ago, and it differed in some ways
from other variants of the RCM I’ve seen in the literature.
I implemented a variant of the version from Carey’s Computational Grids book in FinEtools.
One of my modifications enables renumbering of grids with disconnected components.
Oh, I wasn’t aware of this, thanks for the heads up! Do you know what the suspected error is? The output sparsity looks okay to me. This RCM algorithm was part of my FE toolbox for quite some time and I never had a problem, but perhaps there is a minor issue that I may have missed? I do suspect the matrices were SPD and the graph connected in that part of the code. I don’t have a reason to believe unconnected components present a problem though, I do detect and count them, see the symrcm function docs. The permutation “looks fine” in either case. I updated the README.md with some verification and some images.
Zavoianu’s algorithm calls for a constant maintenance of the queue Q. The child C should add its neighbors immediately, in a sorted order, to Q. The queue would then grow very long, and would also need to be sorted all the time. I don’t do any of that.
Perhaps it would be good to agree on a representation of adjacency and degree and then make a collection of bandwidth reduction algorithms in some central place? It could be called BandwidthReduction.jl or something similar? Then other packages, e.g. FinEtools.jl, LightGraphs.jl, BandedMatrices.jl could simply call upon these algorithms.
I must say that I don’t quite agree with the assertion made in the LightGraphs pull-request discussion that graph permutation algorithms are essentially a preprocessing task and that their performance is therefore a secondary consideration. This may be the case for a graph which remains static throughout a computation. My use case, for example, is the following: I often work with hierarchical grids, e.g. hierarchical simplicial finite elements, sparse grids with wavelet bases, etc. which are continuously refined throughout the computation based on some form of error measure. Depending on the method of integration over these grids, it is often possible to compute integrals (the weights attached to edges in the graph) involving only the additional basis functions (the new vertices in the graph) while leaving the “old” components of the operator unchanged. For this purpose I use some form of a dynamic graph structure for insertion and deletion. In a perfect world, in addition to having a performant renumbering algorithm, I desire some form of “online” updating algorithm for the numbering of the basis functions in the sparse operator representation instead of just appending them to the operator (in the lower right block) and then permuting.
Hi, I just wanted to express interest in this area too. I had previously been using AMD and related algorithms, and recently started wondering about implementing an ordering algorithm too – until I found this thread that is. Not sure yet if CM/RCM will perform well in our case (robotics navigation and SLAM), but worth looking at more closely. We continuously modify and increase the problem and resolve it many times over. Also have special requirements like forcing certain variables in early or late in the resulting ordering, which relates to information sharing and computation recycling. Thanks for posting!
In case you haven’t seen, I actually found this RCM discussion from JuliaFEM first:
I’d recommend using LightGraphs.jl since there is capability to get the adjacency matrix but not yet to extract a variable ordering for fill-in / bandwidth-reduction / pivoting. We have a great interest in this problem as mentioned before although we using the ordering to build a junction / Bayes tree structure from cyclic graphs (other jargon includes elimination / clique / rake-compress / bucket tree).
Options we currently use are:
get AMD ordering directly from standard qr in Julia,
q,r,p = qr(A, Val(true)) where p is the forward ordering (assuming its Householder based),
p = cholfact(A'A,:U,Val(true))[:p] for Cholesky pivot order
the SuiteSparse CCOLAMD varieties, also available through Julia SparseMatrix, (see an example we were given here). FYI, we are not using CCOLAMD as the default.
Either GPL or commercial license (SuiteSparse), currently a little unclear where libamd lands.
(Disclosure, I help develop and maintain packages over at JuliaRobotics and have strong use case for variable ordering in adjacency graphs – most of the related action is in IncrementalInference.jl. Also note, we are not using these methods for matrices as the discussion might suggest, the effort rather is to rebuild computational capacity with graphs as primary modeling language to allow non-Gaussian and non-linear behavior in a more natural form.)
The GPS paper acknowledged that the results of GPS of reducing the bandwidth were not that much of an improvement on RCM, and their main claim was that their algorithm ran so much faster. The curious thing is that I remember a friend telling me that he implemented both RCM and GPS and they ran in approximately the same time. So I think there might have been something wrong with the implementation of the RCM in the GPS paper.
Edit: Sorry, I shouldn’t have referred to bandwidth. Rather it is the profile (fill-in) that is being minimized.