# Is there a lazy version of graph?

I came across an interesting problem recently that can be solved with the shortest path algorithms from Graphs.jl. The only problem is that it takes a “long time” (julia scale ) to build the graph because there are 250K vertices and ~1mm edges.

In my situation, the existence of these edges can be calculated on the fly. I wonder if it’s possible to build a lazy graph without having to materialize all the edges upfront. Perhaps I just pass a function that will return `true` if there’s an edge between the two vertices. My vertices are just numbered sequentially 1:n.

Does anyone know if Graphs.jl can support that or if there’s another package with such functionality? Thanks

2 Likes

As far as I’m aware, there is still no package for building lazy graphs (and I think it would be an awesome feature). You can totally make a new graph type extending `AbstractGraph`, and use the functions of Graphs.jl.
You just need to implement all the API functions, you can check the documentation to create an alternate graph type. vertices are labelled by a subtype of `Integer`. Despite what claims the documentation, you should have the neighbors functions return vertices in ascending order and the vertices must form a continuous range starting from one (each vertex must be lower than `nv(g)`), otherwise some functions will not work as expected. This is especially true for `astar` implementation, where vertices are directly used for indexing.

You can implement `has_edge`, `outneighbors ` and `inneighbors` lazily.

3 Likes

What is wrong in the docs? Doesn’t it say “ascending order”?

Yes, but it does not speak of continous range starting from one, and ascending order is only for `AbstractSimpleGraph`, so in theory, `AbstractGraph` doesn’t require it, but I’m not sure how well this is respected in the codebase…

Ok, let’s update that

2 Likes

Also, if your performance problem is only in building the graph, how do you construct it? If you add every single edge one by one, this is a very inefficient way to build it, because SimpleGraphs maintain the adjacency lists in ascending order, so it does recompute a lot of things for each edge addition. Edge addition is not constant time for SimpleGraph because of it’s internal structure. You can improve this by batching edge additions, or directly building the `fadjlist` (and `badjlist` for directed graphs) yourself.

Quick update:
It turns out that my performance problem came from populating the sparse array rather than constructing the graph. I use a sparse matrix to keep track of the edge distances, and the `setindex!` calls to the array took the majority of time. That’s actually a little surprising though… I will check further when I have time later.

Thanks for the ideas above and sorry about the noise

Tom

P.S. I create all edges in an array and make a single call to create the graph.

3 Likes

Creating an empty sparse matrix and filling it with `setindex!` isn’t particular fast because the CSC format is optimised for efficient arithmetic operations and not for changes to the sparsity structure.

1 Like