Semantic Segmentation PyTorch code for our paper: Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition

The results on the ADE20K validation set of our PyConvSegNet (using multi-scale inference):

Backbone            mean IoU    pixel Acc.
ResNet-50           42.88%      80.97%      (model)
PyConvResNet-50     43.31%      81.18%      (model)
ResNet-101          44.39%      81.60%      (model)
PyConvResNet-101    44.58%      81.77%      (model)
ResNet-152          45.28%      81.89%      (model)
PyConvResNet-152    45.64%      82.36%      (model)

Our best single-model result on the test set (mIoU=39.13, pAcc=73.91, score=56.52) was obtained with PyConvResNet-152 as the backbone, trained on the train+val sets for 120 epochs.
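For readers unfamiliar with the metrics reported above, here is a minimal NumPy sketch of how mean IoU and pixel accuracy are commonly computed from a confusion matrix. It is an illustration under assumptions, not the evaluation code of this repository: the function names, the ignore value of 255, and the choice to average IoU only over classes that appear are all ours.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_label=255):
    """Count (target, pred) pairs; rows are ground truth, columns are predictions.

    ignore_label=255 is an assumption of this sketch.
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    keep = target != ignore_label  # drop pixels without a valid label
    idx = target[keep] * num_classes + pred[keep]
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_metrics(conf):
    """Return (mean IoU, pixel accuracy) for a confusion matrix."""
    tp = np.diag(conf).astype(np.float64)     # correctly labelled pixels per class
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0                       # skip classes that never occur
    mean_iou = (tp[present] / union[present]).mean()
    pixel_acc = tp.sum() / conf.sum()
    return mean_iou, pixel_acc


# Tiny example: 4 pixels, 2 classes, the last pixel is ignored.
conf = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2)
mean_iou, pixel_acc = segmentation_metrics(conf)
```

The reported score is consistent with the average of the two metrics: (39.13 + 73.91) / 2 = 56.52.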


Install PyTorch and the dependencies listed in requirements.txt:

pip install -r requirements.txt

A faster alternative, which avoids installing PyTorch and other deep learning libraries, is to use NVIDIA-Docker;
we used this container image.

Download the ImageNet pretrained models and add the corresponding path to the config .yaml file.

Download the ADE20K dataset. Note that this code uses label ids starting from 0, while the original ids start from 1, so you need to preprocess the original labels by subtracting 1.
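The label shift can be sketched as follows. This is a hedged illustration, not the preprocessing script used by the authors: the function name `shift_labels` is hypothetical, and mapping the original id 0 (unlabeled pixels) to an ignore value of 255 is an assumption that should be checked against the ignore label set in your config file.

```python
import numpy as np

IGNORE_LABEL = 255  # assumption: value the training code treats as "ignore"


def shift_labels(label, ignore_label=IGNORE_LABEL):
    """Map original ids 1..N to 0..N-1; send the original id 0 to ignore_label."""
    label = np.asarray(label).astype(np.int64)  # avoid uint8 wrap-around surprises
    shifted = label - 1
    shifted[label == 0] = ignore_label          # unlabeled pixels become "ignore"
    return shifted.astype(np.uint8)


# Small example mask with original ids 0 (unlabeled), 1, 2 and 150.
original = np.array([[0, 1], [2, 150]], dtype=np.uint8)
converted = shift_labels(original)
```

In practice, each annotation image would be loaded into an array, passed through such a function, and saved back before training.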

Training and Inference

To train a model on the ADE20K dataset, for instance with the 50-layer PyConvResNet as the backbone (first update the config file, e.g. config/ade20k/pyconvresnet50_pyconvsegnet.yaml):

./tool/ ade20k pyconvresnet50_pyconvsegnet

Run inference on the validation set (first update the TEST section of config/ade20k/pyconvresnet50_pyconvsegnet.yaml):

./tool/ ade20k pyconvresnet50_pyconvsegnet


If you find our work useful, please consider citing:

  author  = {Ionut Cosmin Duta and Li Liu and Fan Zhu and Ling Shao},
  title   = {Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition},
  journal = {arXiv preprint arXiv:2006.11538},
  year    = {2020},


This code is based on this repository. We thank the authors for open-sourcing their code.