Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets


🚧 This is a preview version. Still in progress …



The code has been tested in the following environment:

Package Version
Python 3.8.12
PyTorch 1.10.1
CUDA 11.3.1
PyTorch Geometric 1.7.2
RDKit 2022.09.5

NOTE: The current implementation relies on PyTorch Geometric (PyG) < 2.0.0. We will fix compatibility issues with the latest PyG version in the future.

Install via conda YAML file (CUDA 11.3)

conda env create -f env_cuda113.yml
conda activate Pocket2Mol

Manual installation

conda create -n Pocket2Mol python=3.8
conda activate Pocket2Mol

conda install pytorch==1.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.10.1+cu113.html
pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.10.1+cu113.html
pip install torch-cluster -f https://data.pyg.org/whl/torch-1.10.1+cu113.html
pip install torch-geometric==1.7.2

conda install -c conda-forge rdkit
conda install pyyaml easydict python-lmdb -c conda-forge
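
Because the code currently relies on PyG < 2.0.0, it can be worth checking the installed version before running anything. The snippet below is only a sketch: `pyg_version_ok` is a hypothetical helper, not part of this repository, and it looks only at the major version number of the string it is given.

```shell
# Hypothetical helper (not part of the repository).
# Succeeds when the major version of the given version string is below 2,
# e.g. "1.7.2" passes and "2.0.4" fails.
pyg_version_ok() {
  major="${1%%.*}"                      # text before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;            # reject empty or non-numeric input
  esac
  [ "$major" -lt 2 ]
}
```

With the environment activated, it could be called as `pyg_version_ok "$(python -c 'import torch_geometric; print(torch_geometric.__version__)')" || echo "PyG >= 2.0.0 detected"`.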


Please refer to README.md in the data folder.


Sampling for pockets in the testset

To sample molecules for the i-th pocket in the testset, please first download the trained models following README.md in the ckpt folder. Then, run the following command:

python scripts/sample.py --data_id {i} --outdir ./outputs  # Replace {i} with the index of the data. i should be between 0 and 119 for the testset.

We recommend specifying the GPU device number and restricting the CPU cores with a command like:

CUDA_VISIBLE_DEVICES=0 taskset -c 0 python scripts/sample.py --data_id 0 --outdir ./outputs
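
To sample for several test-set pockets in a row, the single-pocket command can be wrapped in a loop. The following is a minimal sketch, not part of the repository: `valid_data_id` and `sample_range` are hypothetical helper names, and `DRY_RUN` is an assumed switch that defaults to printing the commands rather than running them. It omits `taskset`, which can be added back on Linux if CPU pinning is wanted.

```shell
# Hypothetical helpers (not part of the repository).

valid_data_id() {
  # Accept only plain decimal integers from 0 to 119 (the test-set range),
  # without leading zeros.
  case "$1" in
    ''|*[!0-9]*|0?*) return 1 ;;
  esac
  [ "${#1}" -le 3 ] && [ "$1" -le 119 ]
}

sample_range() {
  # Usage: sample_range START END [GPU] [OUTDIR]
  start="$1"; end="$2"; gpu="${3:-0}"; outdir="${4:-./outputs}"
  valid_data_id "$start" && valid_data_id "$end" || {
    echo "data_id values must be integers between 0 and 119" >&2
    return 1
  }
  i="$start"
  while [ "$i" -le "$end" ]; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run: only show the command that would be executed.
      echo "CUDA_VISIBLE_DEVICES=$gpu python scripts/sample.py --data_id $i --outdir $outdir"
    else
      # Stop at the first failing run.
      CUDA_VISIBLE_DEVICES="$gpu" python scripts/sample.py --data_id "$i" --outdir "$outdir" || return 1
    fi
    i=$((i + 1))
  done
}
```

For example, `sample_range 0 9` prints the ten commands for pockets 0 through 9, and `DRY_RUN=0 sample_range 0 9 1` would run them on GPU 1.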

Sampling for PDB pockets





  title={Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets},
  author={Xingang Peng and Shitong Luo and Jiaqi Guan and Qi Xie and Jian Peng and Jianzhu Ma},
  booktitle={International Conference on Machine Learning},

