Improving performance of recursions by avoiding closures

Just implemented Tarjan’s SCC (strongly-connected components) algorithm. In the first version, I created the dfs() function (which is recursive) inside the scc() function, and let it capture the needed variables from the body of the scc() function.

Note that in this version I already moved now and group_num into a mutable struct, since I need to modify them during the recursion, and using them as plain Int was much more expensive. The other variables, low, ord, visited, ids, were defined in the outer scc() function.

function dfs(v::Int)
    info.now += 1
    low[v] = info.now
    ord[v] = info.now
    push!(visited, v)

    for to in csr[v]
        if ord[to] == 0
            dfs(to)
            low[v] = min(low[v], low[to])
        else
            low[v] = min(low[v], ord[to])
        end
    end

    if low[v] == ord[v]
        info.group_num += 1

        while true
            u = pop!(visited)
            ord[u] = n
            ids[u] = info.group_num
            if u == v
                break
            end
        end
    end
end

Benchmarking showed that there were a lot of allocations. For a randomly generated graph with 200,000 nodes and 500,000 edges:

julia --project=. main.jl
  26.753 ms (199511 allocations: 19.07 MiB)

And after I moved all the information into the info variable, and made it a parameter of the recursive function:

function dfs(v::Int, info::SCCIterationInfo)
    info.now += 1
    info.low[v] = info.now
    info.ord[v] = info.now
    push!(info.visited, v)

    for to in info.csr[v]
        if info.ord[to] == 0
            dfs(to, info)
            info.low[v] = min(info.low[v], info.low[to])
        else
            info.low[v] = min(info.low[v], info.ord[to])
        end
    end

    if info.low[v] == info.ord[v]
        info.group_num += 1

        while true
            u = pop!(info.visited)
            info.ord[u] = info.n
            info.ids[u] = info.group_num
            if u == v
                break
            end
        end
    end
end

I got ~25% performance improvement.

julia --project=. main.jl
  19.920 ms (15 allocations: 12.97 MiB)
2 Likes