TKCN
Most existing semantic segmentation methods employ atrous convolution to enlarge the receptive field of filters, but neglect important local contextual information. To tackle this issue, we first propose a novel Kronecker convolution, which uses the Kronecker product to expand its kernel so that it takes into account the feature vectors neglected by atrous convolution. It can therefore capture local contextual information and enlarge the field of view of filters simultaneously, without introducing extra parameters. Second, we propose the Tree-structured Feature Aggregation (TFA) module, which expands according to a recursive rule and forms a hierarchical structure. It can thus naturally learn representations of multi-scale objects and encode hierarchical contextual information in complex scenes. Finally, we design the Tree-structured Kronecker Convolutional Network (TKCN), which employs Kronecker convolution and the TFA module. Extensive experiments on three datasets, PASCAL VOC 2012, PASCAL-Context and Cityscapes, verify the effectiveness of the proposed approach.
Approach
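The kernel expansion behind Kronecker convolution can be illustrated with a small pure-Python sketch. It assumes the transformation matrix is an r1 x r1 matrix whose upper-left r2 x r2 block is all ones and whose remaining entries are zero; that form, and the names `kron`, `transform_matrix` and `expand_kernel`, are assumptions made for illustration and are not taken from this repository's code.

```python
def kron(a, b):
    """Kronecker product of two 2-D lists (lists of rows)."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def transform_matrix(r1, r2):
    """Assumed form: r1 x r1 zeros with an r2 x r2 block of ones at the top-left."""
    return [[1 if (i < r2 and j < r2) else 0 for j in range(r1)]
            for i in range(r1)]


def expand_kernel(kernel, r1, r2):
    """Expand a kernel with the Kronecker product; no new weights are created."""
    return kron(kernel, transform_matrix(r1, r2))


kernel = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# r2 = 1: each weight is followed by zeros, i.e. the atrous-style sparse pattern.
atrous_like = expand_kernel(kernel, r1=2, r2=1)

# r2 = 2: each weight is shared over a 2 x 2 block, so neighbouring
# positions skipped in the r2 = 1 case are also covered.
dense = expand_kernel(kernel, r1=2, r2=2)
```

Under this assumption, both expanded kernels are 6 x 6 and reuse the same nine weights; only the number of positions each weight touches changes, which is consistent with the claim that no extra parameters are introduced.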
Performance
For VOC 2012, we evaluate the proposed TKCN model on the test set without external data such as the COCO dataset.
For Cityscapes, the proposed TKCN is trained only on the fine-labeled set.
Method | Conference | Backbone | PASCAL VOC 2012 test set (mIoU %) | Cityscapes test set (mIoU %) | PASCAL-Context val set (mIoU %) |
---|---|---|---|---|---|
DeepLabv2 | - | ResNet-101 | 79.7 | 70.4 | 45.7 |
RefineNet | CVPR2017 | ResNet-101 | 82.4 | 73.6 | 47.1 |
SAC | ICCV2017 | ResNet-101 | - | 78.1 | - |
PSPNet | CVPR2017 | ResNet-101 | 82.6 | 78.4 | 47.8 |
DUC-HDC | WACV2018 | ResNet-101 | - | 77.6 | - |
AAF | ECCV2018 | ResNet-101 | 82.2 | 79.1 | - |
BiSeNet | ECCV2018 | ResNet-101 | - | 78.9 | - |
PSANet | ECCV2018 | ResNet-101 | - | 80.1 | - |
DeepLabv3+ | ECCV2018 | Xception | 89.0 | - | - |
DFN | CVPR2018 | ResNet-101 | 82.7 | 79.3 | - |
DSSPN | CVPR2018 | ResNet-101 | - | 77.8 | - |
CCL | CVPR2018 | ResNet-101 | - | - | 51.6 |
EncNet | CVPR2018 | ResNet-101 | 82.9 | - | 51.7 |
DenseASPP | CVPR2018 | DenseNet | - | 80.6 | - |
TKCN | - | ResNet-101 | 83.2 | 79.5 | 51.8 |
Note: "-" indicates that the method does not report the corresponding result. DeepLabv3+ employs a more powerful backbone network (Xception) and is pretrained on MS-COCO and JFT. DenseASPP also employs a more powerful backbone network (DenseNet).
Installation
- Install PyTorch
- The code is developed with Python 3.6.6 on Ubuntu 16.04. (GPU: Tesla K80; PyTorch: 0.5.0a0+a24163a; CUDA: 8.0)
- Clone the repository
git clone https://github.com/wutianyiRosun/TKCN.git
cd TKCN
python setup.py install
- Pretrained model
The pretrained model ImageNet_ResNet-101 is available here. Put it under the folder "./TKCN/tkcn/pretrained_models".
- Dataset Configuration
- Download the Cityscapes dataset and convert the dataset to 19 categories. It should have this basic structure.
├── cityscapes_test_list.txt
├── cityscapes_train_list.txt
├── cityscapes_trainval_list.txt
├── cityscapes_val_list.txt
├── cityscapes_val.txt
├── gtCoarse
│ ├── train
│ ├── train_extra
│ └── val
├── gtFine
│ ├── test
│ ├── train
│ └── val
├── leftImg8bit
│ ├── test
│ ├── train
│ └── val
├── license.txt
- These .txt files can be downloaded from here
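Before training, it may help to confirm that the dataset folder matches the structure listed above. The following is a minimal sketch that only checks for the files and directories named in that tree; the function name `missing_entries` and the constants are hypothetical helpers, not part of this repository.

```python
import os

# Files listed at the top level of the dataset root in the tree above.
EXPECTED_FILES = [
    "cityscapes_test_list.txt",
    "cityscapes_train_list.txt",
    "cityscapes_trainval_list.txt",
    "cityscapes_val_list.txt",
    "cityscapes_val.txt",
    "license.txt",
]

# Sub-directories listed in the tree above, relative to the dataset root.
EXPECTED_DIRS = [
    os.path.join(top, split)
    for top, splits in [
        ("gtCoarse", ["train", "train_extra", "val"]),
        ("gtFine", ["test", "train", "val"]),
        ("leftImg8bit", ["test", "train", "val"]),
    ]
    for split in splits
]


def missing_entries(root):
    """Return the expected files and directories that are absent under root."""
    missing = [f for f in EXPECTED_FILES
               if not os.path.isfile(os.path.join(root, f))]
    missing += [d for d in EXPECTED_DIRS
                if not os.path.isdir(os.path.join(root, d))]
    return missing
```

Calling `missing_entries("/path/to/cityscapes")` (the path is a placeholder) returns an empty list when every listed entry is present. The check does not verify that the labels have been converted to 19 categories.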
Train your own model
For Cityscapes
- training on train+val set
cd tkcn
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python train.py --model tkcnet --backbone resnet101
- single-scale testing (on test set)
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python eval.py --model tkcnet --backbone resnet101 --resume-dir cityscapes/model/tkcnet_model_resnet101_cityscapes_gpu6bs6epochs240/TKCNet101 --resume-file checkpoint_240.pth.tar
- multi-scale testing (on test set)
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python eval.py --model tkcnet --backbone resnet101 --multi-scales --resume-dir cityscapes/model/tkcnet_model_resnet101_cityscapes_gpu6bs6epochs240/TKCNet101 --resume-file checkpoint_240.pth.tar
- For testing, the pretrained model file can be downloaded here: tkcn_cityscapes_checkpoint_240_ontrainval.pth
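The single-scale and multi-scale test commands above differ only in the `--multi-scales` flag, so they can be assembled from one helper when scripting evaluations. This is a sketch that uses only the flags shown above; the helper names `build_eval_command` and `build_env` are hypothetical and not part of this repository.

```python
import os


def build_eval_command(resume_dir, resume_file, multi_scale=False):
    """Assemble the eval.py argument list using the flags shown above."""
    cmd = ["python", "eval.py", "--model", "tkcnet", "--backbone", "resnet101"]
    if multi_scale:
        cmd.append("--multi-scales")
    cmd += ["--resume-dir", resume_dir, "--resume-file", resume_file]
    return cmd


def build_env(gpu_ids):
    """Copy the current environment and restrict the visible GPUs."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    return env


# Example (not executed here): run multi-scale testing from the tkcn folder.
# import subprocess
# subprocess.run(
#     build_eval_command(
#         "cityscapes/model/tkcnet_model_resnet101_cityscapes_gpu6bs6epochs240/TKCNet101",
#         "checkpoint_240.pth.tar",
#         multi_scale=True,
#     ),
#     env=build_env(range(8)),
#     cwd="tkcn",
#     check=True,
# )
```

Passing the arguments as a list avoids shell quoting issues with long checkpoint paths.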
Citation
If TKCN is useful for your research, please consider citing:
@article{wu2018tree,
title={Tree-structured Kronecker Convolutional Networks for Semantic Segmentation},
author={Wu, Tianyi and Tang, Sheng and Zhang, Rui and Li, Jintao},
journal={arXiv preprint arXiv:1812.04945},
year={2018}
}