[ANN] GraphNets.jl - Simple, blazing fast, message-passing graph neural network

Hello all!

Excited to release a new package:

GraphNets.jl is a Julia implementation of DeepMind’s Graph Nets, the message-passing architecture described in the paper Relational inductive biases, deep learning, and graph networks.

This YouTube presentation by Petar Veličković may be of interest as well. It’s a fantastic intro to GNNs and “Graph Nets” in particular:

Also, here’s a toy project demonstrating the training of a GNN using GraphNets.jl:

https://juliamltools.github.io/graph-sort

I hope you find it useful! Feedback is much appreciated.

14 Likes

Thank you for the contribution!!!
Could you elaborate a little on what makes it different from existing packages for GNNs? My search yielded two (GraphNeuralNetworks.jl and GeometricFlux.jl), but there may be others I don’t know about.

5 Likes

Hi Guillaume! GraphNets.jl is an implementation of one particular GNN architecture, Graph Nets, a message-passing variety of GNN that supports input/output feature vectors at the node, edge, and graph (aggregate) levels. Because GraphNets.jl does not try to support other GNN flavors (convolutional, attentional, and other message-passing varieties), it can specialize on and optimize for just the “Graph Nets” architecture as outlined here.
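
For reference, the GN block from that paper updates edges first, then nodes, then the global (graph-level) feature vector:

$$
\mathbf{e}'_k = \phi^e(\mathbf{e}_k, \mathbf{v}_{r_k}, \mathbf{v}_{s_k}, \mathbf{u}), \qquad
\mathbf{v}'_i = \phi^v(\bar{\mathbf{e}}'_i, \mathbf{v}_i, \mathbf{u}), \qquad
\mathbf{u}' = \phi^u(\bar{\mathbf{e}}', \bar{\mathbf{v}}', \mathbf{u})
$$

where $\bar{\mathbf{e}}'_i$ aggregates the updated features of the edges arriving at node $i$, $\bar{\mathbf{e}}'$ and $\bar{\mathbf{v}}'$ are graph-level aggregations, and each $\phi$ is a learned update function (typically an MLP).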

In particular, I think GraphNets.jl has a novel approach to batching together graphs of different structure (different adjacency matrices), which allows fast mini-batch training and lets all matrix multiplications take advantage of cuBLAS strided batched matrix multiply (in practice, NNlib.batched_mul).
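
To illustrate the batching idea itself (this is a generic sketch of strided batched multiplication with NNlib, not the package’s actual internals; the padding scheme and array shapes are assumptions):

using NNlib

# Pad every graph in the mini-batch to a common node count, stack the dense
# adjacency matrices and node-feature matrices along a third dimension, and
# aggregate neighbor features for all graphs with one strided batched multiply.
N, F, B = 8, 16, 32            # padded node count, feature dim, graphs per batch
A = rand(Float32, N, N, B)     # one (padded) adjacency matrix per graph
X = rand(Float32, N, F, B)     # node features per graph

H = batched_mul(A, X)          # dispatches to cuBLAS strided batched GEMM on GPU arrays
size(H)                        # (8, 16, 32)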

It looks like GraphNeuralNetworks.jl offers GNN implementations primarily of the convolutional and attentional type. There does seem to be a message-passing variant, but I don’t see that edge vectors are allowed as input or output (I see scalar edge weights, but no support for full edge feature vectors).

GeometricFlux.jl seems to offer a generalized GNN/geometric-learning interface, but without hyper-focused streamlining of Graph Nets. It seems optimized for a generic GNN interface, with batching of disparate graphs as an option rather than a core focus. I could be wrong about the overall performance there, though, as I haven’t yet done an apples-to-apples comparison! Beyond that, I just found the API to be complex. In that regard, perhaps GraphNets.jl is a competitor to GeometricFlux.jl in the narrow scope of “Graph Nets” (message passing with edge/node/graph-level features).

1 Like

Very nice work.

I think that parts of this announcement and the comments comparing it to related packages should be at the beginning of your README.md :wink:

  • How large a graph will be too large for this package and implementation?
  • Can it interoperate with Graphs.jl to try to re-invent classic graph algorithms?
  • I did not see any dependence on SparseArrays.jl. Is it possible for this package to be extended to work with sparse matrices, or does the algorithm depend on all-to-all operations so that a sparse matrix representation would be meaningless?
1 Like

Hi Nikos!

  • How large a graph will be too large for this package and implementation?
    Good question! Essentially, it depends on the size of your GPU memory and on your graphs’ edges (edge count and feature dimension). A rough approximation for the edge-feature input alone is ED (edge feature dimension) * NE (edges per graph) * B (batch size, i.e. graphs per mini-batch) * 4 (bytes per Float32 component, so a 7-dimensional edge feature vector takes 7 Float32s) * NL (number of GNN layers, e.g. 4 for encoder -> core -> core -> decoder). That is the number of bytes just for the neural-net edge input held in GPU RAM throughout a training step; you also need to add the matrices that transform the edge dimensions throughout the network. A similar calculation applies to the node-feature tensors, but the aggregate edge-feature tensors will likely be much bigger. Short answer to your question: you’ll find the limit empirically as you increase the number of edges per graph, the number of graphs per mini-batch, and the size of the edge vectors. (A rough sketch of this arithmetic is included after this list.)

  • Can it interoperate with Graphs.jl to try to re-invent classic graph algorithms?
    GNNs can certainly re-invent classic graph algorithms. Right now, the only example I have is sorting the graph nodes by using binary edge/node feature outputs to indicate a “sorted” path through the graph. I plan to do another toy example of a traveling-salesman “shortest path”. All of this is independent of the Graphs.jl package, though. This presentation may be of interest to you: Petar Veličković - Reasoning Algorithmically: from Toy Experiments to AGI Modules - IPAM at UCLA - YouTube

  • I did not see any dependence on SparseArrays.jl
    SparseArrays is used only optionally, to the extent that the user specifies the adjacency matrix in a sparse format. However, dense Float32 matrices are used for all batched matrix multiplications inside the neural network blocks.
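
Here is a rough sketch of the memory arithmetic from the first answer above (the variable names mirror the abbreviations used there; the numbers are purely illustrative):

ED = 16                  # edge feature dimension
NE = 1_000               # edges per graph
B  = 32                  # graphs per mini-batch
NL = 4                   # GNN layers (encoder -> core -> core -> decoder)
bytes_per_float32 = 4

edge_input_bytes = ED * NE * B * bytes_per_float32 * NL
edge_input_gib = edge_input_bytes / 1024^3   # ≈ 0.0076 GiB for these toy numbers

Scaling up NE, B, and ED (plus the node-feature tensors and the layer weights) is what eventually exhausts GPU memory.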

Hi,
I’m the author of GraphNeuralNetworks.jl (GNN.jl). I wonder why you are rolling out a new and very limited GNN package instead of contributing to existing ones and benefiting the community. If we want to bring the Julia ML ecosystem, and specifically the part related to GNNs, up to the standards of the Python alternatives, we have to join forces.

It looks like GraphNeuralNetworks.jl offers GNN implementations primarily of the convolutional and attentional type. There does seem to be a message-passing variant, but I don’t see that edge vectors are allowed as input or output (I see scalar edge weights, but no support for full edge feature vectors).

Actually, many layers in GNN.jl support scalar and vector edge features; see the table in the docs. Also, GNN.jl lets you easily express the message-passing layer you implemented.

GraphNets.jl is an implementation of one particular GNN architecture, Graph Nets, a message-passing variety of GNN that supports input/output feature vectors at the node, edge, and graph (aggregate) levels. Because GraphNets.jl does not try to support other GNN flavors (convolutional, attentional, and other message-passing varieties), it can specialize on and optimize for just the “Graph Nets” architecture as outlined here.

Did you try to implement the functionality you want using GNN.jl?

From a brief peek at the library and at the paper “Relational inductive biases, …” (which, by the way, is mainly a position paper; the message passing it proposes is not much used in practice, nor is it implemented by the main Python graph libraries), your GNNBlock can be implemented in GNN.jl roughly as follows (code not tested):

using GraphNeuralNetworks
using Flux: Dense, @functor
using Statistics: mean

# One "Graph Nets" block: separate update functions for edges, nodes,
# and the graph-level (global) feature vector.
struct GNBlock{E,N,G}
    edgefn::E
    nodefn::N
    graphfn::G
end

@functor GNBlock

function GNBlock(((ein, nin, gin), (eout, nout, gout)))
    edge_input = ein + 2 * nin + gin     # edge features + both endpoints + global
    node_input = nin + eout + gin        # node features + aggregated edges + global
    graph_input = nout + eout + gin      # aggregated nodes + aggregated edges + global
    return GNBlock(Dense(edge_input => eout),
                   Dense(node_input => nout),
                   Dense(graph_input => gout))
end

function (m::GNBlock)(g::GNNGraph, x, e, u)
    # Edge update: concatenate edge features, both endpoint features,
    # and the broadcast global feature, then apply the edge MLP.
    e = vcat(e, broadcast_edges(g, u))
    e = apply_edges((xi, xj, e) -> vcat(xi, xj, e), g; xi = x, xj = x, e)
    e = m.edgefn(e)

    # Node update: aggregate incoming edge features per node.
    ē = aggregate_neighbors(g, mean, e)
    x = m.nodefn(vcat(x, ē, broadcast_nodes(g, u)))

    # Global update: pool nodes and edges per graph.
    u = vcat(reduce_nodes(mean, g, x), reduce_edges(mean, g, e), u)
    u = m.graphfn(u)

    return x, e, u
end
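
For concreteness, here is a hypothetical (and equally untested) usage of the block above, with made-up feature dimensions:

g = rand_graph(10, 30)                        # 10 nodes, 30 edges
block = GNBlock(((8, 16, 4), (8, 16, 4)))     # (edge, node, graph) input and output dims
e = rand(Float32, 8, 30)                      # edge features
x = rand(Float32, 16, 10)                     # node features
u = rand(Float32, 4, 1)                       # one global feature vector per graph
x2, e2, u2 = block(g, x, e, u)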

This is something worthy of a PR to GNN.jl. If you felt some features were missing, it would have been good to open an issue and start a discussion …
We could now have one library with 19 layers, instead of two libraries, one with 18 layers and one with 1.

In particular, I think GraphNets.jl has a novel approach to batching together graphs of different structure (different adjacency matrices), which allows fast mini-batch training and lets all matrix multiplications take advantage of cuBLAS strided batched matrix multiply (in practice, NNlib.batched_mul).

How does this batching work? Is it documented somewhere? One should benchmark the performance on standard tasks and datasets, e.g. the examples in

From my glimpse at the library, I suspect that the message passing scheme doesn’t scale well since it involves dense matrix multiplications. You cannot go beyond small/medium graphs with this approach.

Efficient message passing approaches are typically implemented in terms of gather/scatter operations or generalized sparse matrix multiplications.
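
As a minimal illustration of the gather/scatter style (generic NNlib operations, not tied to any of the packages discussed here; the index vectors and shapes are made up):

using NNlib: gather, scatter

F, N, E = 4, 5, 7                        # feature dim, node count, edge count
x = rand(Float32, F, N)                  # node features
src = rand(1:N, E); dst = rand(1:N, E)   # edge endpoints

msg = gather(x, src)                             # F×E: one message per edge (source node features)
agg = scatter(+, msg, dst; dstsize = (F, N))     # sum messages into each destination node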

6 Likes

Got it, apologies!

@JuliaMLTools @CarloLucibello Can you outline the state of GNNs in Julia for newcomers? How do GraphNeuralNetworks.jl and GeometricFlux.jl differ from each other, and how do these two differ from GraphNets.jl?

Furthermore, is there a big feature gap with Python, e.g. PyG et al.?

GraphNeuralNetworks.jl is the most feature-rich and actively maintained of the three; you should use that.

For basic tasks I don’t think there is much of a feature or performance gap with PyG. More advanced tasks like very-large-graph training, however, are not yet supported.

3 Likes

Thank you @gdalle for always bringing existing projects to the table. It is super important. :100:

1 Like

Thank you @CarloLucibello for organizing the house in our ML ecosystem. Your work is appreciated!

1 Like

I just want to mention that I would love your input for a section on this domain in Julia Package Comparisons. There is already this page on graphs, but AFAICT it covers a different set of packages than graph neural networks.