How does the performance of GraphNeuralNetworks.jl compare to PyTorch Geometric?

I’ve been playing around with the excellent GraphNeuralNetworks.jl and am thinking about applying it to the biochemistry domain, specifically drug discovery. I was wondering if anyone has tried to benchmark CPU and/or GPU training speed for the same network against PyTorch Geometric?

Also, are there any gotchas I should be on the lookout for? The tasks I have in mind involve both node-level and full-graph predictions.
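For concreteness, the two model shapes I mean look roughly like this (just a sketch using GraphNeuralNetworks.jl layer names; the feature sizes are placeholders):

using GraphNeuralNetworks, Flux
using Statistics: mean

nin, nh = 16, 64  # placeholder feature sizes

# node-level prediction: one output per node
node_model = GNNChain(GCNConv(nin => nh, relu),
                      GCNConv(nh => nh, relu),
                      Dense(nh, 1))

# full-graph prediction: pool node states into one vector per graph
graph_model = GNNChain(GCNConv(nin => nh, relu),
                       GCNConv(nh => nh, relu),
                       GlobalPool(mean),  # aggregate nodes -> graph embedding
                       Dense(nh, 1))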

Sorry for the vague question. :slight_smile:

//Mike


It would be nice to compare performance. PyG has plenty of examples; we should try to translate some of them and compare times. Do you have a specific dataset in mind?
Maybe this example of graph regression on the QM9 dataset could be a good start?

Set2Set hasn’t been implemented yet, but it should be easy.
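
On the Julia side, the timing itself could be as simple as this (a rough sketch on a random toy graph; a real benchmark would of course run over QM9 batches):

using GraphNeuralNetworks, Flux, BenchmarkTools
using Statistics: mean

# toy graph: 100 nodes, 400 edges, 16 node features
g = rand_graph(100, 400, ndata = (; x = rand(Float32, 16, 100)))
y = rand(Float32, 1, 1)                     # fake graph-level target
model = GNNChain(GCNConv(16 => 64, relu),
                 GlobalPool(mean),
                 Dense(64, 1))
loss(m) = Flux.mse(m(g, g.ndata.x), y)

@btime Flux.gradient(loss, $model)          # time one backward pass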


Yes, QM9 is super relevant. I’ll try to port it. :blush::pray:t2:

We can omit the Complete transformation.
I opened a PR for Set2Set.

The other ingredient is QM9. Hopefully it can be loaded from

If that doesn’t work, you can try with

julia> using HuggingFaceDatasets

julia> d = load_dataset("lisn519010/QM9", split="full").with_format("julia")
Dataset({
    features: ['x', 'edge_index', 'edge_attr', 'y', 'pos', 'z', 'name', 'idx'],
    num_rows: 130831
})
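
Rows come back as dictionaries, so converting one record to a GNNGraph should be something like this (untested sketch, assuming edge_index arrives as two vectors of 0-based indices and x as a vector of per-node feature vectors):

using GraphNeuralNetworks

row = d[1]
s = row["edge_index"][1] .+ 1           # 0-based -> 1-based indices
t = row["edge_index"][2] .+ 1
x = Float32.(reduce(hcat, row["x"]))    # (nfeatures, nnodes) matrix
g = GNNGraph((s, t), ndata = (; x))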

I found that QM9 is already available through MLDatasets.jl

using MLDatasets
data = TUDataset("QM9")  # downloads the TUDataset release of QM9

But it doesn’t have the number of rows I would expect: I get 129,433 instead of 133,886, and I’m not sure why. The file qm9.csv, which I think is the original data, has 133,886 rows.

But the TUDataset website does indeed list 129,433 graphs for qm9.zip.

I’ll double-check which one PyTorch Geometric is using in their example.
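
In the meantime, the conversion side looks straightforward, since GraphNeuralNetworks.jl provides a converter for the MLDatasets.jl graph datasets (untested sketch, if I read the docs right):

using MLDatasets, GraphNeuralNetworks

data = TUDataset("QM9")
graphs = mldataset2gnngraph(data)   # one GNNGraph per molecule
g = graphs[1]                       # inspect the first molecule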
