Implementing Cuthill-McKee

I am trying to implement Cuthill-McKee on a number of large, sparse matrices. In particular, I would like to get the mapping from the original row order to the post-CM (or post-reverse-CM) row order.

The only package that I can find that implements Cuthill-McKee does this using nodes. In particular, the example linked above wants a Dict for the sparsity pattern (and another Dict representing the node types).

First, is there a cleaner way that anyone knows to do this? And, if not, is there a clean way to output the sparsity pattern as a dict?

As of now, I suspect I will just be writing a lot of for-loops, but I am hoping that there is a more elegant solution. Thank you!
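For concreteness, the sparsity pattern can be turned into the kind of adjacency Dict such node-based implementations expect without many nested for-loops. A minimal Python sketch of the idea (the conversion itself is language-agnostic and the same loop is trivial in Julia; `sparsity_to_adjacency` is a hypothetical helper name, not from any package):

```python
def sparsity_to_adjacency(entries, n):
    """Turn a list of (row, col) index pairs of structural nonzeros
    into an adjacency dict mapping each node to the set of its neighbors."""
    adj = {i: set() for i in range(n)}
    for i, j in entries:
        if i != j:             # skip the diagonal: a node is not its own neighbor
            adj[i].add(j)
            adj[j].add(i)      # symmetrize: Cuthill-McKee assumes an undirected graph
    return adj
```

With CSC storage you would read the (row, col) pairs straight off the column pointers and row indices, so the whole conversion stays a single pass over the nonzeros.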

@dlfivefifty have you done this in ApproxFun yet?

I implemented it a few years ago. Function rcm at

I remember it worked. I don’t know if it is efficient. Of course, it was tailored for a specific data structure. I will try to convert it to use a CSC sparse matrix this weekend.

No I haven’t. Would be nice to have in BandedMatrices.jl

Thanks everyone. I found an acceptable intermediate solution for my particular problem that avoids the need for CM/RCM.

@Paulo_Jabardo, if you do ever convert your code to work with CSC Sparse Matrices, please do let me know.

@dlfivefifty, if I end up writing my own reasonably efficient code for RCM, I’ll try to contribute it to BandedMatrices.jl.


If someone is still interested in this, I had it lying around so I made a quick package. Probably quite rough, so anyone is welcome to make it prettier. See:


Thanks for sharing Robert.

I do have some nagging doubts about the source algorithm (Zavoianu).
I looked at the Zavoianu version some time ago, and it differed in some ways
from other variants of the RCM I’ve seen in the literature.

I implemented a variant of the version from Carey’s Computational Grids book in FinEtools.


One of my modifications enables renumbering of grids with disconnected components.

Disclaimer: I haven’t optimized the speed for larger matrices yet.
For smaller matrices it runs okay.
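The disconnected-components modification amounts to restarting the breadth-first sweep from a fresh minimum-degree seed whenever the queue drains, labeling each component as you go. A short Python sketch of that restart loop (an illustration of the idea only, not the FinEtools code; the function name is my own):

```python
from collections import deque

def label_components(adj):
    """Label every vertex with a component id, restarting BFS from a
    minimum-degree unvisited vertex whenever the queue drains, which is
    the kind of restart needed to renumber grids with disconnected parts."""
    comp = {}
    cid = 0
    unvisited = set(adj)
    while unvisited:
        # seed each component at a vertex of minimal degree
        start = min(unvisited, key=lambda v: len(adj[v]))
        unvisited.discard(start)
        comp[start] = cid
        q = deque([start])
        while q:
            v = q.popleft()
            for w in adj[v]:
                if w in unvisited:
                    unvisited.discard(w)
                    comp[w] = cid
                    q.append(w)
        cid += 1
    return comp, cid    # labels and the number of components found
```

Running the numbering per component this way also gives you the component count for free, which matches the detect-and-count behavior described above.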


Oh, I wasn’t aware of this, thanks for the heads up! Do you know what the suspected error is? The output sparsity looks okay to me. This RCM algorithm was part of my FE toolbox for quite some time and I never had a problem, but perhaps there is a minor issue that I missed? I do suspect the matrices were SPD and the graph connected in that part of the code. I don’t have a reason to believe disconnected components present a problem, though; I detect and count them (see the symrcm function docs). The permutation “looks fine” in either case. I updated it with some verification and some images.


Zavoianu’s algorithm calls for constant maintenance of the queue Q: the child C should immediately add its neighbors, in sorted order, to Q. The queue would then grow very long and would also need to be sorted all the time. I don’t do any of that.
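To make the distinction concrete: the usual RCM sweep keeps a plain FIFO queue and only sorts each vertex’s unvisited neighbors by degree once, at the moment they are enqueued, rather than keeping the whole of Q sorted. A Python sketch of that variant (a hypothetical illustration, not the code under discussion):

```python
from collections import deque

def rcm(adj):
    """Reverse Cuthill-McKee sketch over an adjacency dict: a plain FIFO
    queue; each vertex's unvisited neighbors are sorted by degree only
    when they are enqueued, so the queue itself is never re-sorted."""
    visited = set()
    order = []
    # try seeds in order of increasing degree; later seeds also pick up
    # any disconnected components the earlier sweeps never reached
    for seed in sorted(adj, key=lambda v: len(adj[v])):
        if seed in visited:
            continue
        visited.add(seed)
        q = deque([seed])
        while q:
            v = q.popleft()
            order.append(v)
            # sort only this vertex's unvisited neighbors, by degree
            for w in sorted((u for u in adj[v] if u not in visited),
                            key=lambda u: len(adj[u])):
                visited.add(w)
                q.append(w)
    return order[::-1]    # reversing the CM order gives RCM
```

The local sort is over a handful of neighbors at a time, which is where the cost saving over a globally maintained sorted queue comes from.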

What a coincidence, LightGraphs is getting that algorithm too:


That is pretty cool. It would have saved me some time.
By the way, where is it? I can’t find it. Did you rename it?

It should be in the pull request that I linked (it is not merged yet).


Perhaps it would be good to agree on a representation of adjacency and degree, and then make a collection of bandwidth-reduction algorithms in some central place? It could be called BandwidthReduction.jl or something similar. Then other packages, e.g. FinEtools.jl, LightGraphs.jl, and BandedMatrices.jl, could simply call upon these algorithms.

I must say that I don’t quite agree with the assertion made in the LightGraphs pull-request discussion that graph permutation algorithms are essentially a preprocessing task and that their performance is therefore a secondary consideration. That may be the case for a graph that remains static throughout a computation.

My use case, for example, is the following: I often work with hierarchical grids, e.g. hierarchical simplicial finite elements, sparse grids with wavelet bases, etc., which are continuously refined throughout the computation based on some form of error measure. Depending on the method of integration over these grids, it is often possible to compute integrals (the weights attached to edges in the graph) involving only the additional basis functions (the new vertices in the graph) while leaving the “old” components of the operator unchanged. For this purpose I use some form of dynamic graph structure supporting insertion and deletion.

In a perfect world, in addition to a performant renumbering algorithm, I would like some form of “online” updating algorithm for the numbering of the basis functions in the sparse operator representation, instead of just appending them to the operator (in the lower-right block) and then permuting.


Hi, I just wanted to express interest in this area too. I had previously been using AMD and related algorithms, and recently started wondering about implementing an ordering algorithm myself; that is, until I found this thread. I’m not sure yet if CM/RCM will perform well in our case (robotics navigation and SLAM), but it’s worth looking at more closely. We continuously modify and grow the problem and re-solve it many times over. We also have special requirements, like forcing certain variables early or late into the resulting ordering, which relates to information sharing and computation recycling. Thanks for posting!

In case you haven’t seen, I actually found this RCM discussion from JuliaFEM first:


Which Julia package would you recommend for the Cuthill-McKee algorithm? I couldn’t find it in LightGraphs.jl

Shameless plug: FinEtools.jl has it. Copy and paste, if you must. Or use with FinEtools.jl.



Thanks @PetrKryslUCSD, that’s a great resource!

I’d recommend LightGraphs.jl, since it can produce the adjacency matrix, but it cannot yet extract a variable ordering for fill-in / bandwidth reduction / pivoting. We have a great interest in this problem, as mentioned before, although we use the ordering to build a junction / Bayes tree structure from cyclic graphs (other jargon includes elimination / clique / rake-compress / bucket tree).

Options we currently use are:

  • get AMD ordering directly from standard qr in Julia,
    • q, r, p = qr(A, Val(true)) where p is the forward ordering (assuming it’s Householder-based),
    • p = cholfact(A'A,:U,Val(true))[:p] for Cholesky pivot order
  • the SuiteSparse CCOLAMD varieties, also available through Julia SparseMatrix (see an example we were given here). FYI, we are not using CCOLAMD as the default.
    • Either GPL or commercial license (SuiteSparse), currently a little unclear where libamd lands.
  • AMD.jl, [external C dependency, license unclear]

It currently seems like implementing the 1976 Gibbs-Poole-Stockmeyer (GPS) algorithm in LightGraphs.jl might be the highest-value path:

We are accumulating a running list of variable-ordering algorithm resources in a wiki here, if anyone is interested.

(Disclosure: I help develop and maintain packages over at JuliaRobotics and have a strong use case for variable ordering in adjacency graphs; most of the related action is in IncrementalInference.jl. Also note, we are not using these methods for matrices, as the discussion might suggest; rather, the effort is to rebuild computational capacity with graphs as the primary modeling language, to allow non-Gaussian and non-linear behavior in a more natural form.)


The GPS paper acknowledged that GPS was not much of an improvement on RCM in reducing the bandwidth; their main claim was that their algorithm ran much faster. The curious thing is that I remember a friend telling me that he implemented both RCM and GPS and they ran in approximately the same time. So I think there might have been something wrong with the implementation of RCM in the GPS paper.

Edit: Sorry, I shouldn’t have referred to bandwidth; rather, it is the profile (fill-in) that is being minimized.


In what sense is the license unclear?
