# Find k-cliques in a graph

I have an undirected graph, given for the sake of simplicity as a list of edges `(n,m)`, where `n` and `m` are integers corresponding to nodes. I need to find all possible k-cliques (groups of `k` nodes wherein all members are connected to all other members). I see that in LightGraphs.jl there is an algorithm (inherited, I believe, from Graphs.jl, which inherited it from the NetworkX package) for finding all maximal cliques (i.e. k-cliques with a maximum k). But that is a much more difficult problem (NP complete, I think).

The question: is there in the Julia ecosystem a package/algorithm for finding only the k-cliques I need, and that is hopefully more efficient than first computing all maximal cliques? If not, what is the best known algorithm to do that?

(For context: I need this to find all k-simplices in a discretized Brillouin Zone mesh, to compute electronic structure of a generic system in the github.com/pablosanjose/Elsa.jl package)

May or may not be state of the art: https://people.csail.mit.edu/virgi/combclique-ipl-g.pdf

1 Like

Many thanks @mohamed82008, that seems just what I need! Iâ€™m reading it nowâ€¦

What is the kind of size and connectivity of your graph, and what `k`s do you need?

How many k-cliques do you expect to find? Algorithm choice massively depends on these. Iâ€™m not familiar with Brillouin Zone meshes, but it sounds grid-like.

Iâ€™d guess: Number of vertices is N, each has ~C1 neighbors, and given two neighboring vertices p and q, there are ~C2 shared neighbors, and the induced subgraph on the set of neighbors is dense (has ~C2^2 edges). Now youâ€™re looking for k-cliques, and want good scaling in k, C1, C2, and linear scaling in N. If you expect many k-cliques, then your quest is for an algorithm that has low overhead per discovered clique.

My recommendation for that kind of problem would be to iterate over edges, and then put the induced subgraph into a bitmap (this already iterates over 4-cliques).

A specialized implementation that respects your graph will always beat general-purpose algorithms and implementations.

Hi @foobar_lv2,

What is the kind of size and connectivity of your graph, and what `k` s do you need?

The mesh is an arbitrary Delaunay triangulation of a finite region of D-dimensional flat space.
D is not simply 1, 2 and 3â€¦ It can be up to say 6. The value of k of my desired k-cliques is equal to D. The connectivity of the graph is of order 3D, approx. I do expect many k-cliques (because of the Delaunay property)

Iâ€™ll have to think a bit more about the rest of your message. I think youâ€™re hinting to the kind of algorithm Iâ€™m actually implementing, which basically scans neighborhood of neighborhoods for each vertex, and filter out k-cliques from that, but Iâ€™m still working on the details. I want to check whether the paper linked by @mohamed82008 has any clever trickâ€¦

My recommendation for that kind of problem would be to iterate over edges, and then put the induced subgraph into a bitmap (this already iterates over 4-cliques).

Could you please provide a little bit more detail or a reference? What is a bitmap in this context?

1-cliques: Vertices. You already have this.
2-cliques: Edges. You already have this.
3-cliques: Triangles. For every edge, compute the intersecion of the neighborhoods of both vertices. Both are sorted arrays, so this is fast (you need a specialized function for this!).
4-cliques: Let `link(p,q)` be the intersection computed for the 3-cliques. Now, for each z in `link(p,q)` you intersect the neigborhood of z with the link (again, intersection of two sorted lists of integers)
5-cliques: You store all the 4-cliques from the last step into a bitmap. In other words, you compute the induced subgraph on `link(p,q)`. Now you are interested in triangles in this small graph; and the graph is already stored as a bitmap. Iterate over edges, take the intersection of neighborhoods (which is probably a single bitwise AND, if your links have <= 64 vertices; depending on your SIMD width it is almost surely a single vector bitwise AND instruction).
6-cliques: Repeat obvious recursive algorithm.
7-cliques: Repeat obvious recursive algorithm.

You can take code from https://github.com/chethega/BitmapGraphs for bitmapped graphs. This has most of the required pieces (you may need to adapt some things). It already contains the code for constructing a bitmap representation of the link of an edge in a lightgraph, so you get 3-cliques and 4-cliques without writing anything yourself.

I strongly recommend to use the `SBMGraph{N}` variant. This induces a single dynamic dispatch for each edge, but then the 5-7 cliques are allocation free, all loops unrolled, etc. I should probably make this a real package at some time, and make the types explicit `VecElement` to help the compiler figure out that it should SIMD, but you can start with this.

PS. Feel free to join the #graphs channel on slack, open issues on my non-package, or just take code and run with it.

PPS. Bitmap means in this context a datastructure for a graph where every edge is a single bit (1 if edge is present, 0 if not present), and you have aligned rows, i.e. `neighbors(g, i)` always starts at a 64-bit boundary (i.e. an UInt64). This basically means that you round up the number of vertices to a multiple of 64, and then use a `BitMatrix`. It is very helpful to store the number of `UInt64` that represent `neighbors(g, i)` in the type, which is what `SBMGraph{N}` does. That way, all loops over the chunks are fully unrolled and typically a single instruction (for small `N`, which is what you care about). Relevant not-super-trivial parts are only a bit of pointer arithmetic (in order to quickly manipulate the underlying tuple types) and the fast iteration over bits in a tuple (same code as Base logical indexing by bitarray, i.e. tzcnt and blsr instructions), as well as API design (my API kinda sucks).

5 Likes

Awesome @foobar_lv2! This is truly useful, Iâ€™ll look into it in detail, particularly your implementation, which Iâ€™m sure is top notch. Iâ€™ve just joined #graphs in slack too.

Thanks!!

A clarification: I was surprised you didnâ€™t simply write

5-cliques: Repeat obvious recursive algorithm.

You suggest using a bitmap instead. This is simply an optimisation, yes? I would be satisfied to do a generic recursion following your 3,4 description to arbitrary k. Is the added code complexity of the bitmap approach really worth it?

EDIT: My graphs are very sparse, by the way, and bitmaps are dense IIUC

If the computation is fast enough for you without the bitmap, you can stay with that.

There are faster general algorithms for enumerating k-cliques than the â€śobvious recursionâ€ť, like the paper @mohamed82008 linked. The change of storage format uses two very important assumptions: (1) CPU are really good at SIMD and really bad at branching (`if cond ...`). Doing 64 entries at a time (thatâ€™s just an UInt64; even more when using vector registers) in a mostly branchfree matter gives massive speedup on real hardware. (2) This two-tiered approach only makes sense when you assume that your original graph is sparse, but `link(p, q)` is dense.

The bitmap approach sacrifices clever algorithms (like the better complexity class in the cited paper, i.e. a polylog factor) for exploiting these two properties. It really is the dumbest possible algorithm, but respects your silicon and the statistics of your graph. If its assumptions are satisfied (and they should be, for delaunay) then this will almost surely beat much smarter general-purpose algorithms for enumerating k-cliques, and these will in turn beat dumb general-purpose algorithms (like the recursion without bitmaps) that donâ€™t make use of these structures.

That being said, it is probably possible to go much faster by more specialized algorithms that respect the geometry of your problem. These algorithms would not take a graph as input, but instead your point set, and compute both k-cliques and the delaunay triangulation at the same time. What I proposed walks some middle ground between general-purpose and highly specialized.

The point of using the bitmap is that all the intersections of sets you need to compute in the later steps, starting with 5-cliques, typically take a single instruction (thatâ€™s a fraction of a CPU cycle).

1 Like

Iâ€™m not sure about the optimality of the implementation, but `vietorisrips` from https://github.com/piever/PersistentCohomology.jl (a bit of an experimental package) gives you the cliques for `k=1,..,n` (they are called cochains there, because of some reasons that may be a bit obscure if one doesnâ€™t come from an algebraic topology background). It actually also gives the time at which they are â€śbornâ€ť (each non-zero entry of the sparse matrix also has a value and the return type of the algorithm also tells you the smallest number `x` for which the cochain exists for using entries smaller or equal than `x`): again, this is relevant for algebraic topology applications, you can probably just ignore it. The core of the recursive algorithm is here .

2 Likes

Nice!

We should try making that work with https://github.com/bbrehm/Sparips.jl , especially since you natively support sparse matrices. For that, I guess that I should make a proper package that only includes computation of the sparse complex (point cloud, distance function and desired precision -> sparse matrix and aposteriori precision), and another small package that just wraps my fork of ripser (sparse matrix -> persistence diagram) as well as a package that does my plotting (persistence diagram and aposteriori precision -> approximate plot). You would then provide a second solver backend to me (more flexibility, less dependency management), and I would provide a sparsifying pre-processor to you (more speed at cost of some user-defined precision).

As you can guess from the bitmap-graphs, I am in the process of developing a new solver; but I donâ€™t really see any significant parts I can repurpose from anywhere else.

@pablosanjose You should probably try @pieverâ€™s package first: If it happens to be fast enough for you, then you donâ€™t need to chase unneeded performance.

1 Like

Yeah, it seems easier to understand and adapt to my needs (thanks @piever!). At the moment I just need to get a working solution, performance will be tackled later. But if the bitmap approach could be fleshed out for general use in the future, that would be so nice! By the way, is it necessary to have the full LightGraphs as a dependency for that, or could it perhaps be made into a slimmer package?

Hm! I only now realised the importance of this comment. Indeed, as I construct my Delaunay triangulation I get all my k-cliques in the process, duh! Even better, if I have a graph with constant local connectivity (as often happens in my application), finding its k-cliques is O(n), actually much faster and easier than using a general k-clique algorithm. I simply got fixated on the latter because of the appeal of the problem. I often do that :-).

1 Like

Sounds good to me! I actually pretty much only support sparse matrices as input so the two packages go well together. Please feel free to open an issue over at PersistentCohomology if there is some interface the object I return should respect (the persistent cocycles), at the moment I just return an array where the i-th element stores `i-1`-cochains.

Yep, you seemed to have some very specific know how on the various techniques, looking forward to trying out the new solver.