Detect and Return Cycles in a given directed graph

Hi there,
I was wondering if there was a way to detect the number of cycles in a graph. And for each cycle detected, i want to know which nodes are involved in the cycle.

I have attached an example of the simplest form of such graph in the picture below. So from this graph I want to output something like:

number of cycles = 1
dict_cycles[1] = "cycle1" =>  [2,3,4,5]

For a directed graph, there’s simple_cycles().
For undirected graphs, there’s cycle_basis().

The example graph doesn’t look cyclic to me but see Cycles · Graphs.jl for cycle related functions in the Graphs package.

Using simple_cycles() comes up empty.

Perhaps cycle is not the right word for what I’m trying to detect.

@GunnarFarneback

What I’m really trying to detect is: To check if a node’s parents share an ancestor.
So in the example graph above, node 5’s parents are 3 and 4, but 3 and 4 share an ancestor.

To give another example, in the new graph below, node 7’s parent are 5 and 6. But these parents share an ancestor which is node 2.
So there is a cycle between the nodes 2,3,4,5,6,7.

You are correct. There is no cycle between [2,3,5,7,6,3] if we consider the graph to be directed. There is a cycle if the graph is considered to be undirected. Checking if some node share an ancestor would also work in this specific case, but then any node that comes after 7 (e.g., if there was a 7 → 8 → 9) in the graph will also appear as “being in a cycle.” Is this what you want?

No, that would not quite be what i want.
if there was a 7 → 8 → 9) . I would still expect the only “cycle” in the graph to be 2,3,4,5,6,7. as shown below

Searching for every cycles can be very costly, because there can be exponentially many. You said you wanted to find if a node have two parent’s sharing an ancestor. That could be done by a simple DFS starting from each parent of your nodes. If some DFSs reach common vertices, then these are ancestors of at least two parents.

The reason why a DFS wasn’t enough is because a DFS would only tell me if there is a shared ancestor. I want to also know which node that shared ancestor is.

The shared ancestor will be the common node encountered during your search. You can even find the two paths from the ancestor to the two parents by storing the predecessor in the DFS for each node during the search, and at the end, you just have to reconstruct the path. (Or you can just do another DFS, but from the ancestor)

I am not quite sure how to execute this. If you don’t mind, would you be able to show a simple example?

And if there as a 4 -> 10 -> 7? How many cycles are there? one or two? If one, which should be returned? [2,3,5,7,6,4] or [2,3,5,7,10,4]?

function dfs!(g, successors, visited, parent, u)
    for v in inneighbors(g, u)
        if visited[v] == 0 # we never visited this vertex so we mark it as visited and store the predecessors on the path
            visited[v] = parent
            successors[v] = u
            result = dfs!(g, successors, visited, parent, v)
            if result != nothing
                push!(result, v)
                return result
            end
        else
            if visited[v] != parent # we already visited this vertex from another parent, so we stop and return the path
                return [v]
            end
        end
    end
    return nothing
end

g = path_graph(7)
add_vertices!(g, 2)
add_edge!(g, 2, 8)
add_edge!(g, 8, 9)
add_edge!(g, 9, 5)
successors = zeros(Int, nv(g))
visited = zeros(Int, nv(g))
for parent in inneighbors(g, start)
    successors[parent] = start
    path1 = dfs!(g, successors, visited, parent, parent)
    if path1 != nothing
        push!(path1, parent)
        push!(path1, start)
        println(path1)
        # reconstruct the other path
        path2 = [path1[1]]
        while path2[end] != start
            push!(path2, successors[path2[end]])
        end
        println(path2)
        break
    end
end

To be honest i hadn’t yet considered the case of multiple shared ancestors.
Just from visually observing the graph below, I think i would say there are two cycles: [2,3,4,5,6,7] and [2,3,4,5,10,7]

Hi @etienne_dg , first of all thanks for taking the time to provide an example.
Your example code does answer my original question.
However, when I try to test it on a graph with multiple “cycles”, it seems to no longer work

To clarify what i mean let me give two examples:
Say we have a graph like this:

when we run your example to start at node 7 of this graph, the results is:
[2, 4, 6, 7]
[2, 3, 5, 7]
This is exactly the result I was hoping to get.

However say we have a second graph like this


when we run your example, the result is still also:
[2, 4, 6, 7]
[2, 3, 5, 7]

This result is not what I was expecting because of the addition of the path 4 → 10 → 7

I think it is a really good idea to define well what you really mean by cycles.

One question that may help is: considering the original graph is directed, it is a DAG (this is, acyclical for the common definition of cycle)? There can be, for example, a 7 -> 2 edge? Or only a 2 -> 7? If for the common definition of cycles in directed graphs you cannot have a cycle, then maybe what you want are the cycles when the graph is considered to be undirected.

You are right, this is exactly what I want to do. In-fact what i’m working on exclusively uses DAG representation.
So now I’m perhaps thinking a DFS based solution could still work in such a way that:

G=orginal_digraph
Undirected_G = graph(G)

cycles=some_DFS_-based_function(Undirected_G )

To create such DFS_based function that outputs the cycles in Undirected_G , I think I could tweak @etienne_dg 's answer below. I will try this and update the post if successful.

I’m not so sure, I think you don’t want to consider this graph:
1->3, 1->4, 2->3, 2->4
as having a cycle.
I think you want to find the set of pairs of vertex-disjoint s,t-paths.

My code only return one such cycle. I don’t know why you want to find all the cycles because there can be exponentially many, and its probably not what you want. You should probably explain what you want to achieve on a higher level to avoid the XY problem.

You can adapt my code by storing at each node every parent, and for each parent, every successor instead of just a singe one. During the dfs for some parent, say parent p_1, if we encounter for the first time parent p_2, then we have a cycle. If its not the first time, this will not be a simple cycle so we ignore it. It during the path, we already met a vertex explored by every other parent, then we can backtrack as we won’t find any new simple cycle. If you just want to count the number of such cycles (without returning these), I think you can have an efficient algorithm, with some dynamic programming to count the number of cycles instead of a whole search, but it’s a bit long to explain.

Edit: It’s more complex than what I thought, we can’t just ignore a vertex if its not the first time we encounter some parent.

You are right, this directed graph wouldn’t have a cycle and i can see why converting it to a undirected graph would cause issue.

Ultimately I am trying to do is given some directed graph G:
for every node in the G, check if any of it’s parents share a common ancestor, and if they do, tell me which node that ancestor is.

Which is why a DFS based solution seemed like a good idea, i.e. check if there are “cycles” and output nodes involved with those cycles which is where this post started.

From some of my previous work, I have a function to handle the cases where 2 nodes share an ancestor.
So now I’m thinking what if I first convert G to a new graph G_new. Where in G_new, every node can only have at most 2 parents. This is ofcourse a much different question that what I first started of with and I may have to open a new post for doing this conversion if I can’t figure it out.

In you definition it is implicit that the parents of any node can only have one common ancestor. What if there are many? There are at least two different ways that they may have more than a common ancestor.

  1. The first way they may have more than a common ancestor is already in your examples. The node 1 (that only has an edge going to 2) is also a common ancestor of 7. But you never put it on your examples, so I believe what you mean is more like “a shared ancestor that has at least one path to the node in which it is the only shared ancestor”. I am not sure if this is the exact definition of you want, but it seemed the more appropriate one. Note that if we define like “a shared ancestor that has at least two paths to the node that do not pass through any other shared ancestors” (I was going this direction before), then 2 would not qualify, because it has only the 3,5,7 path without ancestors, the other two paths pass through 4 (which it is a shared ancestor itself).
  2. What if there is a node 11 which goes to both 3 and 4? I am not sure if all nodes need to be connected to 1, but if they need then there may be a connection from 1 to 11, no problem. The graph remain a DAG, and now you do have another node that is basically a mirror of node 2 (i.e., it comes from 1, and it goes to both 3 and 4). By any definition of ancestrality that I know of, both 2 and 11 would be shared ancestors, and there is no criteria for one to be an ancestor and for the other to not be.

Are you intentionally omitting [4,6,7,10]? If so, on what grounds?