This project is a PyTorch implementation of 'Self-supervised Graph-level Representation Learning with Local and Global Structure', which was accepted as a Short Talk at ICML 2021. This repository provides the pre-training and fine-tuning code, together with the pre-trained model for the chemistry domain. A more complete version of the code, including the biology domain, will be announced on the TorchDrug platform developed by the MilaGraph group. We also gratefully acknowledge the excellent work of Pretrain-GNNs, which lays a solid foundation for our work.

More details of this work can be found in our paper: [Paper (arXiv)].


We developed this project with Python 3.6 and the following Python packages:

PyTorch                   1.1.0
torch-cluster             1.4.5
torch-geometric           1.0.3
torch-scatter             1.4.0
torch-sparse              0.4.4
torch-spline-conv         1.0.6
rdkit                     2019.03.1

P.S. In our setup, these packages installed successfully and worked together under CUDA 9.0 and cuDNN 7.0.5.
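
Because these versions are tightly coupled, it can help to confirm what is actually installed before running anything. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_environment` are hypothetical, and it assumes each package is registered under the distribution name shown (PyTorch is published as `torch`). An rdkit installed through conda may not expose package metadata, or may report its version in a slightly different form, so a reported mismatch for rdkit is not necessarily a real problem.

```python
# Pinned versions copied from the list above (PyTorch is distributed as "torch").
REQUIRED = {
    "torch": "1.1.0",
    "torch-cluster": "1.4.5",
    "torch-geometric": "1.0.3",
    "torch-scatter": "1.4.0",
    "torch-sparse": "0.4.4",
    "torch-spline-conv": "1.0.6",
    "rdkit": "2019.03.1",
}


def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        # Standard library on Python 3.8 and later.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for older interpreters such as Python 3.6 (needs setuptools).
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_environment(required=REQUIRED, lookup=installed_version):
    """Return {package: (expected, found)} for every package that does not match."""
    mismatches = {}
    for name, expected in required.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_environment().items():
        print("{}: expected {}, found {}".format(name, expected, found))
```

An empty result from `check_environment()` means every listed package matches its pinned version exactly.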

Dataset Preparation

In the root directory of this project, create a folder for storing datasets:

mkdir dataset

The pre-training and fine-tuning datasets for the chemistry domain can be downloaded from the project page of Pretrain-GNNs.
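
The folder setup can also be scripted. The sketch below is an illustration under assumptions rather than a step the repository prescribes: the helper `prepare_dataset_dir` is hypothetical, and it assumes the download is a single archive that should be unpacked directly into `dataset/`. Check the Pretrain-GNNs instructions for the layout the training scripts actually expect.

```python
import shutil
from pathlib import Path


def prepare_dataset_dir(project_root, archive=None):
    """Create <project_root>/dataset and optionally unpack an archive into it.

    Returns the sorted names of the entries found in the dataset folder.
    """
    dataset_dir = Path(project_root) / "dataset"
    # Like `mkdir dataset`, but does not fail if the folder already exists.
    dataset_dir.mkdir(parents=True, exist_ok=True)
    if archive is not None:
        # The archive format (.zip, .tar.gz, ...) is inferred from the file name.
        shutil.unpack_archive(str(archive), str(dataset_dir))
    # List the contents so the resulting layout can be checked by eye.
    return sorted(p.name for p in dataset_dir.iterdir())
```

For example, `prepare_dataset_dir(".", "downloaded_archive.zip")` would unpack a downloaded file into `./dataset`. The archive name here is a placeholder for whatever file you downloaded.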


To pre-train with the proposed GraphLoG method, simply run:

python --output_model_file $pre-trained_model$


To fine-tune on a downstream dataset, simply run (five independent runs will be performed):

python --input_model_file $pre-trained_model$ \
                   --dataset $downstream_dataset$
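
To fine-tune on several downstream datasets in sequence, the command above can be wrapped in a small driver. This is a sketch under assumptions: `build_finetune_cmd` and `finetune_all` are hypothetical helpers, the fine-tuning script path is passed in by the caller, and only the two flags shown above (`--input_model_file` and `--dataset`) are used.

```python
import subprocess
import sys


def build_finetune_cmd(script, model_file, dataset):
    """Build the argument list for one fine-tuning invocation."""
    return [
        sys.executable, script,            # run the script with the current interpreter
        "--input_model_file", model_file,  # pre-trained model to start from
        "--dataset", dataset,              # downstream dataset to fine-tune on
    ]


def finetune_all(script, model_file, datasets, dry_run=True):
    """Fine-tune on each dataset in turn.

    With dry_run=True (the default) nothing is executed and the commands
    are only returned, so they can be inspected first.
    """
    commands = [build_finetune_cmd(script, model_file, d) for d in datasets]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError and stops at the first failing run.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `finetune_all(script, model_file, datasets)` returns the commands without running them. Pass `dry_run=False` to execute them one after another.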

Pretrained Model

We provide the GIN model pre-trained with GraphLoG in ./models/.