Just implemented Tarjan’s SCC (strongly-connected components) algorithm. In the first version, I created the dfs() function (which is recursive) inside the scc() function, and let it capture the needed variables from the body of the scc() function.
Note that in this version I already moved
nowandgroup_numinto a mutable struct, since I need to modify them during the recursion, and using them as plainIntwas much more expensive. The other variables,low,ord,visited,ids, were defined in the outerscc()function.
function dfs(v::Int)
    info.now += 1
    low[v] = info.now
    ord[v] = info.now
    push!(visited, v)
    for to in csr[v]
        if ord[to] == 0
            dfs(to)
            low[v] = min(low[v], low[to])
        else
            low[v] = min(low[v], ord[to])
        end
    end
    if low[v] == ord[v]
        info.group_num += 1
        while true
            u = pop!(visited)
            ord[u] = n
            ids[u] = info.group_num
            if u == v
                break
            end
        end
    end
end
Benchmarking showed that there were a lot of allocations. For a randomly generated graph with 200,000 nodes and 500,000 edges:
julia --project=. main.jl
  26.753 ms (199511 allocations: 19.07 MiB)
And after I moved all the information into the info variable, and made it a parameter of the recursive function:
function dfs(v::Int, info::SCCIterationInfo)
    info.now += 1
    info.low[v] = info.now
    info.ord[v] = info.now
    push!(info.visited, v)
    for to in info.csr[v]
        if info.ord[to] == 0
            dfs(to, info)
            info.low[v] = min(info.low[v], info.low[to])
        else
            info.low[v] = min(info.low[v], info.ord[to])
        end
    end
    if info.low[v] == info.ord[v]
        info.group_num += 1
        while true
            u = pop!(info.visited)
            info.ord[u] = info.n
            info.ids[u] = info.group_num
            if u == v
                break
            end
        end
    end
end
I got ~25% performance improvement.
julia --project=. main.jl
  19.920 ms (15 allocations: 12.97 MiB)