Prediction-Guided Distillation

PyTorch implementation of our paper: Prediction-Guided Distillation for Dense Object Detection


  • Our codebase is built on top of MMDetection, which can be installed following the official instructions.
  • We used PyTorch pre-trained ResNets for training.
  • Please follow the official MMDetection instructions to set up the COCO dataset.
  • Please download the CrowdHuman dataset and set it up by running this script.


Set up datasets and pre-trained models

mkdir data
ln -s path_to_coco data/coco
ln -s path_to_crowdhuman data/crowdhuman 
ln -s path_to_pretrainedModel data/pretrain_models 
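
A symbolic link stores its target exactly as typed, and a relative target is resolved from the directory containing the link (here `data/`), so absolute paths are the safer choice for the three `ln -s` commands above. The sketch below is one way to confirm that each link resolves to a real directory before starting a run. The function name `check_data_links` is hypothetical and not part of this repository; the directory names are taken from the commands above.

```shell
# Hypothetical helper (not part of this repository): verify that the three
# links created above exist and point at real directories.
check_data_links() {
    root="${1:-data}"   # directory that holds the links; defaults to ./data
    missing=0
    for name in coco crowdhuman pretrain_models; do
        # [ -d ] follows symlinks, so a dangling link fails this test too
        if [ ! -d "$root/$name" ]; then
            echo "missing or broken: $root/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_data_links` from the repository root returns a non-zero status and names the offending entries on stderr if any link is missing or dangling.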

COCO Experiments

# ------------------------------------
#    Here we use ATSS as an example
# ------------------------------------

# Training and testing teacher model
zsh tools/ work_configs/detectors/ 8
zsh tools/ work_configs/detectors/ work_dirs/atss_r101_3x_ms/latest.pth 8

# Training and testing student model 
zsh tools/ work_configs/detectors/ 8
zsh tools/ work_configs/detectors/ work_dirs/atss_r50_1x/latest.pth 8

# Training and testing PGD model
zsh tools/ work_configs/ 8
zsh tools/ work_configs/ work_dirs/pgd_atss_r101_r50_1x/latest.pth 8
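
Each testing command above reads a `latest.pth` checkpoint from the work directory of the corresponding training run. As a rough sketch, the helper below reports which of the three ATSS checkpoints are already present, which can help when resuming a partially completed sequence. The function name `report_checkpoints` is hypothetical and not part of this repository; the run names are copied from the commands above.

```shell
# Hypothetical helper (not part of this repository): report which of the
# checkpoints referenced by the testing commands above already exist.
report_checkpoints() {
    root="${1:-work_dirs}"   # defaults to ./work_dirs
    for run in atss_r101_3x_ms atss_r50_1x pgd_atss_r101_r50_1x; do
        ckpt="$root/$run/latest.pth"
        # [ -f ] follows symlinks, so a link to a saved checkpoint counts
        if [ -f "$ckpt" ]; then
            echo "found   $ckpt"
        else
            echo "missing $ckpt"
        fi
    done
}
```

Calling `report_checkpoints` with no argument inspects `work_dirs/` relative to the current directory and prints one line per run.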

CrowdHuman Experiments

# Training the teacher, conducting KD, and evaluation
zsh tools/

Model Zoo


| Detector | Setting | mAP | Config |
|---|---|---|---|
| FCOS | Teacher (r101, 3x, multi-scale) | 43.1 | config |
| | Student (r50, 1x, single-scale) | 38.2 | config |
| | PGD (r50, 1x, single-scale) | 42.5 (+4.3) | config |
| AutoAssign | Teacher (r101, 3x, multi-scale) | 44.8 | config |
| | Student (r50, 1x, single-scale) | 40.6 | config |
| | PGD (r50, 1x, single-scale) | 43.8 (+3.1) | config |
| ATSS | Teacher (r101, 3x, multi-scale) | 45.5 | config |
| | Student (r50, 1x, single-scale) | 39.6 | config |
| | PGD (r50, 1x, single-scale) | 44.2 (+4.6) | config |
| GFL | Teacher (r101, 3x, multi-scale) | 45.8 | config |
| | Student (r50, 1x, single-scale) | 40.2 | config |
| | PGD (r50, 1x, single-scale) | 43.8 (+3.6) | config |
| DDOD | Teacher (r101, 3x, multi-scale) | 46.6 | config |
| | Student (r50, 1x, single-scale) | 42.0 | config |
| | PGD (r50, 1x, single-scale) | 45.4 (+3.4) | config |


| Detector | Setting | MR ↓ | AP ↑ | JI ↑ | Config |
|---|---|---|---|---|---|
| DDOD | Teacher (r101, 36 epoch, multi-scale) | 41.4 | 90.2 | 81.4 | config |
| | Student (r50, 12 epoch, single-scale) | 46.0 | 88.0 | 79.0 | config |
| | PGD (r50, 12 epoch, single-scale) | 42.8 (-3.2) | 90.0 (+2.0) | 80.7 (+1.7) | config |


  title={{Prediction-Guided Distillation for Dense Object Detection}},
  author={Yang, Chenhongyi and Ochal, Mateusz and Storkey, Amos and Crowley, Elliot J},
  journal={arXiv preprint arXiv:2203.05469},


We thank FGD and DDOD for their code base.

