Inefficiency of add_edge! in SimpleWeightedGraphs

I have a master graph with 250,000 nodes, and a variable called homes which is a Vector of Vectors of size 4, with values between 1 and 100,000. The idea is that many vector of size 4 has a corresponding dense graph, which is embeded in the master graph. The master graph can either be a SimpleGraph or a SimpleWeightedGraph. In the former case, the code runs super fast in about 0.15 sec. In the latter case (with SimpleWeightedGraph), the code takes 24 sec. And yet, when there are only a low number of lists of 4 elements, the SimpleWeightGraphs take only about twice as long as the SimpleGraphs. My conclusion: add_edge! for SimpleWeightedGraph becomes less and less efficient as more and more edges are added. If this is not the case, I have no idea why the code becomes so inefficient. Needless to say, I cannot accept this for my use case.

function ccreateHomeGraph(nb_nodes)
    #mgh = SimpleWeightedGraph(nb_nodes)
    mgh = SimpleGraph(nb_nodes)   # <<<< Should be weighted graph?
    homes = rand(1:100000, 100000)
    homes = reshape(homes, div(length(homes), 4), 4)
    homes = [homes[i,:] for i in 1:size(homes,1)]
    @show size(homes)
    #homes = [[1,2,3,4,5] for i in 1:100000]
    weight = 0.3  # need more generality
    for r in 1:length(homes)
        person_ids = homes[r]
        sz = length(person_ids)
        @show r, sz
        for i in 1:sz
            for j in i+1:sz
                #add_edge!(mgh, person_ids[i]+1, person_ids[j]+1, weight)
                add_edge!(mgh, person_ids[i]+1, person_ids[j]+1)
            end
        end
    end
    return mgh
end
# SimpleGraph about 2x speed of Weighted SimpleGraphs
@time graph = ccreateHomeGraph(250000)

Here is the add_edge! function that is called. Does it become more and more inefficient the more edges are added? If so, why? This function is found in simplegraphs.jl:

# add_edge! will overwrite weights.
function add_edge!(g::SimpleWeightedGraph, e::SimpleWeightedGraphEdge)
    T = eltype(g)
    U = weighttype(g)
    s_, d_, w = Tuple(e)

    if w == zero(U)
        @warn "Note: adding edges with a zero weight to this graph type has no effect." maxlog=1 _id=:swg_add_edge_zero
        return false
    end

    s = T(s_)
    d = T(d_)
    (s in vertices(g) && d in vertices(g)) || return false
    @inbounds g.weights[d, s] = w
    @inbounds g.weights[s, d] = w
    return true
end

After reading more threads on this subject, I recoded to work with MetaGraphs, which runs the full problem in about 40 sec, although it took about 3 seconds with SimpleGraphs. So scaling as a function of the number of edges is improved with MetaGraphs; however, the time is substantially more expensive than with SimpleGraphs. Looking at !add_edge() in MetaGraphs.jl, I fell there is lots of room for improvement if the library contained optimized versions of the add_edge! function. For example, in many use cases, mine, in particular, I know that my end nodes are in the vertex set, so there is no need to check. How expensive is the check? Is it an O(1) check? Retrieving a property from each edge also strikes me as expensive. Just keeping a dictionary{Edge, Property} strikes me as far more efficient. Perhaps that is what MetaGraphs does already?

This is correct. SimpleWeightedGraphs use a sparse matrix as a backend data structure, which is not really efficient for large numbers of inserts. SimpleGraphs uses sorted adjacency lists, which provides O(log(deg)) insertion time.

MetaGraphs uses a SimpleGraph as its core structure, so add_edge! should be pretty close to SG’s performance. If you’re seeing a big difference between MetaGraph add_edge! and SimpleGraph add_edge! then a MWE would be appreciated.

Also, for SG, SWG, and MG, v in vertices(g) is O(1).

Tradeoffs, tradeoffs.

1 Like

@anon94023334: I understand. So let us start with something simple. I apologize for the quite long post, but it is the amalgamation of several posts, which I am told is preferable.

@btime SimpleGraph(200000)
@btime SimpleWeightedGraph(200000)
@btime SimpleWeightedGraph(SimpleGraph(200000))

which produces

6.677 ms (200003 allocations: 16.78 MiB)
  688.282 μs (9 allocations: 3.05 MiB)
21.102 ms (400015 allocations: 35.10 MiB)

This is not logical to me. First, SimpleGraph has one allocation per vertex. Why not have one large allocation if the number of vertices is known? Perhaps you are allowing for the adding of vertices? But then why not define a DynamicSimpleGraph type for that case? SimpleWeightedGraph allocates nothing, which I find logical. So why 400,000 allocations when the constructor of SimpleWeightedGraph is a SimpleGraph? I would have expected at most 200,000 allocations, which I would reduce to one large allocation. Finally, the construction time cost of SimpleWeightedGraph is three times that of SimpleGraph, which seems very excessive. Note that I used @btime, so I am getting the best timings.

Continuing …

sg = SimpleGraph(200000)
@time SimpleWeightedGraph(sg)

which indicates that the 400,000 allocations of SimpleWeightedGraph results from allocating the adjacency matrix from SimpleWeightedGraph. One question is: why not use a reference to this adjacency matrix since it has already been constructed? Second question? Why not copy the adjacency matrix rather than have 200,000 allocations. Here is the likely constructor source code:

# SimpleWeightedGraph{T, U}(SimpleGraph)
function (::Type{SimpleWeightedGraph{T, U}})(g::LightGraphs.AbstractGraph)  where {T<:Integer, U <: Real}
    adj_matrix = if LightGraphs.is_directed(g)
        # TODO abstract function instead of SimpleGraph constructor
        adjacency_matrix(LightGraphs.SimpleGraphs.SimpleGraph{T}(g), U)
    else
        adjacency_matrix(g, U)
    end
    return SimpleWeightedGraph{T, U}(adj_matrix)
end

So I broke my calculation down:

@time adj_matrix = adjacency_matrix(sg)
@time SimpleWeightedGraph(adj_matrix)
  0.018799 seconds (200.01 k allocations: 16.785 MiB)
  0.001432 seconds (3 allocations: 1.526 MiB)

So again, why would getting the adj_matrix from sg produce 200,000 allocations. In my estimation, it should produce no allocations whatsoever!! I notice by the way the the adjacency matrix of SimpleGraph is also of type SparseArrays.SparseMatrixCSC{Int64,Int64} so I do not understand why a SimpleGraph is so much more efficient than a SimpleWeightedGraph. Why would add_edge! in a SimpleWeightedGraph not scale with size if it does scale for SimpleGraph given that the adjacency matrix is sparse for both? This lack of scaling has previously been discussed on discourse.julialang.

More surprising, is that I checked that the statement

adj_matrix = adjacency_matrix(sg)

is not a copy. So why the 200,000 allocations? Only the address of the sparse matrix should be copied. I feel that a copy is made for the pointers to each row. Why copy anything? In fact it doesn’t in the following experiment, in which I inserted three values into adj_matrix. So I am not clear why the copy is made when initializing a variable of type SimpleWeightedGraph:

@time aaa = adj_matrix
  0.000000 seconds
200000×200000 SparseArrays.SparseMatrixCSC{Int64,Int64} with 3 stored entries:
  [3  ,   4]  =  27
  [27 ,  58]  =  32
  [135, 235]  =  10

For completion, I dumped the structure of SimpleGraph, SimpleWeightedGraph and MetaGraph:

julia> dump(SimpleGraph)
UnionAll
  var: TypeVar
    name: Symbol T
    lb: Union{}
    ub: Integer <: Real
  body: SimpleGraph{T<:Integer} <: LightGraphs.SimpleGraphs.AbstractSimpleGraph{T<:Integer}
    ne::Int64
    fadjlist::Array{Array{T,1},1}
# --------------------------------------------------------------------
julia> dump(SimpleWeightedGraph)
UnionAll
  var: TypeVar
    name: Symbol T
    lb: Union{}
    ub: Integer <: Real
  body: UnionAll
    var: TypeVar
      name: Symbol U
      lb: Union{}
      ub: Real <: Number
    body: SimpleWeightedGraph{T<:Integer,U<:Real} <: AbstractSimpleWeightedGraph{T<:Integer,U<:Real}
      weights::SparseArrays.SparseMatrixCSC{U,T}
# -----------------------------------------------------------------
 dump(MetaGraph)
UnionAll
  var: TypeVar
    name: Symbol T
    lb: Union{}
    ub: Integer <: Real
  body: UnionAll
    var: TypeVar
      name: Symbol U
      lb: Union{}
      ub: Real <: Number
    body: MetaGraph{T<:Integer,U<:Real} <: AbstractMetaGraph{T<:Integer}
      graph::SimpleGraph{T}
      vprops::Dict{T,Dict{Symbol,Any}}
      eprops::Dict{LightGraphs.SimpleGraphs.SimpleEdge{T},Dict{Symbol,Any}}
      gprops::Dict{Symbol,Any}
      weightfield::Symbol
      defaultweight::U
      metaindex::Dict{Symbol,Dict{Any,Integer}}
      indices::Set{Symbol}

You told me that you store weights separately from the ```adjacency_matrixbecause many people preferred it that way. But that does not explain why the data structure for the weights is not the same as for the adjacency_matrix? Why should havefadjlist::Array{Array{T,1},1}` for the adacency matrix and a sparse array structure for the weight matrix? Both structures are not equally efficient are they?

I think I have overstayed my welcome. :slight_smile:

but it is the amalgamation of several posts, which I am told is preferable.

We’re at the point, I think, where it might be better to use Github issues for this discussion, as it’s probably not of general interest.

You have one allocation per vertex because we’re creating a vector of (different) empty vectors. What would you do otherwise that would be more efficient? Here’s an alternative:


function sg(n::T) where T <: Integer
    fadjlist = Vector{Vector{T}}()
    sizehint!(fadjlist, n)
    @inbounds for _ = one(T):n                                                                                                                  
        push!(fadjlist, Vector{T}())
    end
    return SimpleGraph{T}(0, fadjlist)
end

Which doesn’t perform as well:

julia> @benchmark SimpleGraphsCore.sg(2_000_000)
BenchmarkTools.Trial: 
  memory estimate:  167.85 MiB
  allocs estimate:  2000003
  --------------
  minimum time:     92.165 ms (0.00% GC)
  median time:      105.792 ms (19.77% GC)
  mean time:        116.216 ms (27.17% GC)
  maximum time:     226.732 ms (63.25% GC)
  --------------
  samples:          44
  evals/sample:     1

julia> @benchmark SimpleGraphsCore.SimpleGraph(2_000_000)
BenchmarkTools.Trial: 
  memory estimate:  167.85 MiB
  allocs estimate:  2000003
  --------------
  minimum time:     78.624 ms (0.00% GC)
  median time:      90.569 ms (24.84% GC)
  mean time:        99.989 ms (31.70% GC)
  maximum time:     212.482 ms (67.59% GC)
  --------------
  samples:          50
  evals/sample:     1

Perhaps you are allowing for the adding of vertices?

Yes. Both SimpleGraph and SimpleWeightedGraph allow graph modification (adding/removing of vertices), though each has its own performance profile/tradeoffs.

But then why not define a DynamicSimpleGraph type for that case?

That’s what these graph types are. If you want an unmodifiable graph, use StaticGraphs.

SimpleWeightedGraph allocates nothing, which I find logical.

Sparse matrices don’t allocate anything except for colptr until you start adding to them. An empty sparse matrix is three pointers and 2 integers big, plus m values of some integer size that represents the column type.

So why 400,000 allocations when the constructor of SimpleWeightedGraph is a SimpleGraph ?

Because with SimpleGraphs you’re simply paying the cost of allocation up front instead of repeatedly during edge insertion. Give it a try: create two empty 200k-vertex graphs (SG, and SWG), and then time adding 2 million random edges to each.

I would have expected at most 200,000 allocations

and that’s what you get:

julia> @btime SimpleGraph(200_000)
  4.137 ms (200003 allocations: 16.78 MiB)

Directed graphs have to store the backward adjacencies, so that doubles the allocations:

julia> @btime SimpleDiGraph(200_000)
  8.546 ms (400005 allocations: 33.57 MiB)

What you’re doing in the third case is converting into a sparse-matrix-backed graph type from a adjacency-list-backed graph type. This will necessarily be less efficient than creating it directly.

Finally, the construction time cost of SimpleWeightedGraph is three times that of SimpleGraph

I don’t see this for empty graphs:

julia> @btime SimpleGraph(200_000)
  4.166 ms (200003 allocations: 16.78 MiB)
{200000, 0} undirected simple Int64 graph

julia> @btime SimpleWeightedGraph(200_000)
  287.023 μs (9 allocations: 3.05 MiB)
{200000, 0} undirected simple Int64 graph with Float64 weights

It’s ~20x faster to create an empty SWG than an empty SG. Again, you’re saving time up front at the cost of worse-performing edge inserts, which is why if you’re adding edges one at a time, you’ll get better performance in the long run with SimpleGraphs.

One question is: why not use a reference to this adjacency matrix since it has already been constructed?

I don’t understand this. Where has it been constructed outside of the adjacency_matrix function?

Why not copy the adjacency matrix rather than have 200,000 allocations
So again, why would getting the adj_matrix from sg produce 200,000 allocations.

For SimpleGraphs, there’s no adjacency matrix to copy until we create it, which is where the allocations are coming from. SimpleGraphs do not store a sparse matrix: they store adjacency lists, and we have to convert the adjacency lists to a sparse matrix in order to provide an adjacency matrix. This involves allocation: specifically, nv(g) ints to put into the colptr vector for the sparse matrix.

In my estimation, it should produce no allocations whatsoever!!

You have to allocate nv(g) memory for the colptr vector in every case. Are you looking at the number of allocations as opposed to the size of the allocs? In that case, it’s because we hand-roll the creation of the adjacency matrix in SimpleGraphs for efficiency, and this requires a bunch of small allocations but it’s the fastest way we’ve found to do it for the general (non-empty) case. If you can find a faster way, please open a PR.

More surprising, is that I checked that the statement

adj_matrix = adjacency_matrix(sg)

is not a copy.

It’s not a copy of anything, because it doesn’t exist anywhere until you create it. There is no sparse matrix just hanging around waiting for you to call adjacency_matrix. The sparse matrix is created when you call the function. Because we don’t want modifications to the adjacency matrix to change the graph from which the AM was created, and because we don’t want to store the graph data twice, we return a separate data structure. This necessarily allocates.

So I am not clear why the copy is made when initializing a variable of type SimpleWeightedGraph

There is no copy, as you’re showing in your aaa = adj_matrix benchmark.

You told me that you store weights separately from the adjacency_matrix because many people preferred it that way.

For SimpleGraphs, we do not store weights at all. For SimpleWeightedGraphs, the weights are stored in the sparse matrix that represents the graph. Unsurprisingly, perhaps, this data structure is called weights.

Why should have fadjlist::Array{Array{T,1},1} for the adacency matrix and a sparse array structure for the weight matrix? Both structures are not equally efficient are they?

We don’t have that. You may be confusing the data structure for SimpleGraph, which is an adjacency list: Vector{Vector{T}} and is called fadjlist, with that of SimpleWeightedGraph, which is a sparse matrix that holds weight information as well, and is called weights.

If you’re asking why we require folks who use SimpleGraphs to store their weight data separately, it’s because we have (deliberately) created no place in the data structure used by SimpleGraph (that is, an adjacency list) to store the weights. SimpleGraphs are weightless.

I think I have overstayed my welcome. :slight_smile:

It’s probably best to continue this via Github issues.

Github issues for which package.

LightGraphs is fine: https://github.com/JuliaGraphs/LightGraphs.jl/issues/new

Just to set expectations, this week is pretty busy for me so please be patient – my responses might be delayed.

1 Like

This also might help explain things:

julia> g = SimpleGraph(2_000_000)
{2000000, 0} undirected simple Int64 graph

julia> a = adjacency_matrix(g)
2000000×2000000 SparseArrays.SparseMatrixCSC{Int64,Int64} with 0 stored entries

julia> @btime SimpleWeightedGraph(a)
  4.256 ms (3 allocations: 15.26 MiB)
{2000000, 0} undirected simple Int64 graph with Int64 weights

julia> @btime SimpleWeightedGraph(g)
  123.319 ms (2000012 allocations: 183.11 MiB)
{2000000, 0} undirected simple Int64 graph with Float64 weights

This is because SimpleWeightedGraph(g) needs to construct (and allocate) the adjacency matrix on every call, because it is not saved, while if you pass in a sparse matrix directly, it’s a zero-allocation operation because we’re just using the reference. (Also, note that the weights on a SimpleWeightedGraph default to Float64, while on a sparse matrix, they default to the eltype of the matrix.)

Thank you very much for taking the time to reply to all my questions. I will pursue this if needed on Github as you suggested. :slight_smile:

But quickly: if I have to create many smaller Metagraphs of different sizes to insert into a master Metagraph, it would be nice to create a single sparse adjacency_matrix and reuse it. If this matrix is created for a matrix of size 50_000 x 50_000, knowing that this is the maximum size matrix I will be handling (the full matrix being 200_000 x 200_000), it might be possible to use it for the smaller matrices, IF the graph size is stored in the struct, similarly to ne. That would greatly cut down allocation. I notice that garbage collection sometimes take 50% to 80% of the time because of all the allocations, which add up. See you on Github!