Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
Paper (CVPR 2021): Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
Updates
- (02/03/2021) Higher performance is reported with the stronger PVT backbone.
- (23/02/2021) Higher performance is reported with the stronger DetCo pre-trained model.
- (02/12/2020) Models and logs (R101_100pro_3x and R101_300pro_3x) are available.
- (26/11/2020) Models and logs (R50_100pro_3x and R50_300pro_3x) are available.
- (26/11/2020) Higher performance for Sparse R-CNN is reported when the dropout rate is set to 0.0.
Models
Method | inf_time | train_time | box AP | download |
---|---|---|---|---|
R50_100pro_3x | 23 FPS | 19h | 42.8 | model / log |
R50_300pro_3x | 22 FPS | 24h | 45.0 | model / log |
R101_100pro_3x | 19 FPS | 25h | 44.1 | model / log |
R101_300pro_3x | 18 FPS | 29h | 46.4 | model / log |
Models and logs are available on Baidu Drive with access code wt9n.
Notes
- We observe about 0.3 AP noise.
- Training time is measured on 8 GPUs with batch size 16. Inference time is measured on a single GPU. All GPUs are NVIDIA V100.
- We use ImageNet pre-trained models from torchvision, and we provide torchvision's ResNet-101.pkl model. More details can be found in the conversion script.
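To compare the rows of the first table on a common footing, the reported numbers can be converted into per-image latency and total GPU-hours. The following is a minimal sketch and not part of the repository: the helper names `latency_ms` and `gpu_hours` are hypothetical, the FPS and training-time values are copied from the table, and the GPU count of 8 comes from the note on the training setup.

```python
# Illustrative arithmetic only; helper names are hypothetical, not from the repo.

# (inference FPS on one V100, training hours on 8 V100s), copied from the table.
MODELS = {
    "R50_100pro_3x": (23, 19),
    "R50_300pro_3x": (22, 24),
    "R101_100pro_3x": (19, 25),
    "R101_300pro_3x": (18, 29),
}

NUM_TRAIN_GPUS = 8  # from the note on the training setup


def latency_ms(fps: float) -> float:
    """Average time per image in milliseconds for a given throughput."""
    return 1000.0 / fps


def gpu_hours(wall_clock_hours: float, num_gpus: int = NUM_TRAIN_GPUS) -> float:
    """Total GPU-hours consumed by a run of the given wall-clock length."""
    return wall_clock_hours * num_gpus


if __name__ == "__main__":
    for name, (fps, hours) in MODELS.items():
        print(f"{name}: {latency_ms(fps):.1f} ms/image, {gpu_hours(hours):.0f} GPU-hours")
```

For example, R50_100pro_3x at 23 FPS works out to roughly 43.5 ms per image, and its 19h run corresponds to 152 GPU-hours.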
Method | inf_time | train_time | box AP | codebase |
---|---|---|---|---|
R50_300pro_3x | 22 FPS | 24h | 45.0 | detectron2 |
R50_300pro_3x.detco | 22 FPS | 28h | 46.5 | detectron2 |
PVTSmall_300pro_3x | 13 FPS | 50h | 45.7 | mmdetection |
PVTv2-b2_300pro_3x | 11 FPS | 76h | 50.1 | mmdetection |
Installation
The codebase is built on top of Detectron2 and DETR.
Requirements
- Linux or macOS with Python ≥ 3.6
- PyTorch ≥ 1.5 and a torchvision version that matches the PyTorch installation. Installing them together from pytorch.org ensures that they match.
- OpenCV is optional, but it is needed for the demo and visualization.
Steps
- Install and build libs
git clone https://github.com/PeizeSun/SparseR-CNN.git
cd SparseR-CNN
python setup.py build develop
- Link the COCO dataset path to SparseR-CNN/datasets/coco
mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017
- Train SparseR-CNN
python projects/SparseRCNN/train_net.py --num-gpus 8 \
--config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml
- Evaluate SparseR-CNN
python projects/SparseRCNN/train_net.py --num-gpus 8 \
--config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
--eval-only MODEL.WEIGHTS path/to/model.pth
- Visualize SparseR-CNN
python demo/demo.py \
--config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
--input path/to/images --output path/to/save_images --confidence-threshold 0.4 \
--opts MODEL.WEIGHTS path/to/model.pth
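Training and evaluation rely on the three links created in the dataset step, so it can help to confirm they resolve before launching a multi-GPU job. The following is a minimal sketch under stated assumptions, not a script shipped with the repository: the function name `missing_coco_entries` is hypothetical, and it only checks that the three entries exist as directories under `datasets/coco`, not that their contents are valid.

```python
# Hypothetical pre-flight check; not part of the SparseR-CNN repository.
from pathlib import Path

# Sub-directories linked in the dataset step of the installation instructions.
EXPECTED_ENTRIES = ("annotations", "train2017", "val2017")


def missing_coco_entries(root="datasets/coco"):
    """Return the expected entries that do not resolve to a directory under root."""
    root = Path(root)
    # Path.is_dir() follows symlinks, so a dangling link is reported as missing.
    return [name for name in EXPECTED_ENTRIES if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_entries()
    if missing:
        print("Missing under datasets/coco:", ", ".join(missing))
    else:
        print("datasets/coco looks complete.")
```

Run from the SparseR-CNN directory, the script lists any entry whose link is absent or points to a path that does not exist.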
Third-party resources
- mmdetection implementation: sparse_rcnn. Thanks to Shilong Zhang!
- cvpod implementation: sparse_rcnn. Thanks to Benjin Zhu!
- paddledetection implementation: sparse_rcnn. Thanks to FL77N!
License
SparseR-CNN is released under the MIT License.