This repo provides the PyTorch implementation of the work:

Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji. GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. In CVPR 2021.


  • Ubuntu 16.04/18.04, CUDA 10.1/10.2, python >= 3.6, PyTorch >= 1.6, torchvision
  • Install detectron2 from source
  • sh scripts/
  • Compile the cpp extension for farthest points sampling (fps):
    sh core/csrc/


Download the 6D pose datasets (LM, LM-O, YCB-V) from the
BOP website and
VOC 2012
for background images.
Please also download the image_sets and test_bboxes from
here (BaiduNetDisk, OneDrive, password: qjfk).

The structure of datasets folder should look like below:

# recommend using soft links (ln -sf)
├── lm_imgn  # the OpenGL rendered images for LM, 1k/obj
├── lm_renders_blender  # the Blender rendered images for LM, 10k/obj (pvnet-rendering)
├── VOCdevkit

Training GDR-Net

./core/gdrn_modeling/ <config_path> <gpu_ids> (other args)


./core/gdrn_modeling/ configs/gdrn/lm/ 0  # multiple gpus: 0,1,2,3
# add --resume if you want to resume from an interrupted experiment.

Our trained GDR-Net models can be found here (BaiduNetDisk, OneDrive, password: kedv).

(Note that the models for BOP setup in the supplement were trained using a refactored version of this repo (not compatible), they are slightly better than the models provided here.)


./core/gdrn_modeling/ <config_path> <gpu_ids> <ckpt_path> (other args)


./core/gdrn_modeling/ configs/gdrn/lmo/ 0 output/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e/gdrn_lmo_real_pbr.pth


If you find this useful in your research, please consider citing:

    title     = {{GDR-Net}: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation},
    author    = {Wang, Gu and Manhardt, Fabian and Tombari, Federico and Ji, Xiangyang},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {16611-16621}