SeMask: Semantically Masked Transformers

Framework: PyTorch

Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi

This repo contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation.

Contents

  1. Results
  2. Setup Instructions
  3. Citing SeMask

1. Results

Note: † denotes backbones pretrained on ImageNet-22k at 384×384 resolution.

ADE20K

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | Config | Checkpoint |
|---|---|---|---|---|---|---|---|
| SeMask-T FPN | SeMask Swin-T | 512×512 | 42.11 | 43.16 | 35M | config | TBD |
| SeMask-S FPN | SeMask Swin-S | 512×512 | 45.92 | 47.63 | 56M | config | TBD |
| SeMask-B FPN | SeMask Swin-B | 512×512 | 49.35 | 50.98 | 96M | config | TBD |
| SeMask-L FPN | SeMask Swin-L | 640×640 | 51.89 | 53.52 | 211M | config | TBD |
| SeMask-L MaskFormer | SeMask Swin-L | 640×640 | 54.75 | 56.15 | 219M | config | TBD |
| SeMask-L Mask2Former | SeMask Swin-L | 640×640 | 56.41 | 57.52 | 222M | config | TBD |
| SeMask-L Mask2Former FAPN | SeMask Swin-L | 640×640 | 56.68 | 58.00 | 227M | config | TBD |
| SeMask-L Mask2Former MSFAPN | SeMask Swin-L | 640×640 | 56.54 | 58.22 | 224M | config | TBD |

Cityscapes

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | Config | Checkpoint |
|---|---|---|---|---|---|---|---|
| SeMask-T FPN | SeMask Swin-T | 768×768 | 74.92 | 76.56 | 34M | config | TBD |
| SeMask-S FPN | SeMask Swin-S | 768×768 | 77.13 | 79.14 | 56M | config | TBD |
| SeMask-B FPN | SeMask Swin-B | 768×768 | 77.70 | 79.73 | 96M | config | TBD |
| SeMask-L FPN | SeMask Swin-L | 768×768 | 78.53 | 80.39 | 211M | config | TBD |
| SeMask-L Mask2Former | SeMask Swin-L | 512×1024 | 83.97 | 84.98 | 222M | config | TBD |

COCO-Stuff 10k

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | Config | Checkpoint |
|---|---|---|---|---|---|---|---|
| SeMask-T FPN | SeMask Swin-T | 512×512 | 37.53 | 38.88 | 35M | config | TBD |
| SeMask-S FPN | SeMask Swin-S | 512×512 | 40.72 | 42.27 | 56M | config | TBD |
| SeMask-B FPN | SeMask Swin-B | 512×512 | 44.63 | 46.30 | 96M | config | TBD |
| SeMask-L FPN | SeMask Swin-L | 640×640 | 47.47 | 48.54 | 211M | config | TBD |
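
The mIoU columns above report mean intersection-over-union as a percentage. For readers unfamiliar with the metric, here is a minimal pure-Python sketch of how it can be computed from flattened label maps. This is an illustration only, not the evaluation code used in this repo: the function name `mean_iou`, the `ignore_index=255` default, and the choice to skip classes that never occur are all assumptions made for the example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or ground truth.

    pred, target: flat sequences of integer class labels of equal length.
    Pixels whose ground-truth label equals ignore_index are skipped
    (assumed convention; predictions are assumed to lie in [0, num_classes)).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Pixel is in both the predicted and true region of class t.
            intersection[t] += 1
            union[t] += 1
        else:
            # Pixel is in the predicted region of p and the true region of t.
            union[p] += 1
            union[t] += 1
    # Average only over classes that occur at all.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(f"mIoU: {100 * mean_iou(pred, target, num_classes=3):.2f}")
```

In practice the per-class intersection and union counts would be accumulated over the whole validation set before averaging. The "ms+flip" column additionally averages predictions over multiple input scales and horizontal flips before the metric is computed.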

2. Setup Instructions

We provide the codebase with SeMask incorporated into various models. Please check the setup instructions inside the corresponding folders.

3. Citing SeMask

@article{jain2022semask,
  title={SeMask: Semantically Masking Transformer Backbones for Effective Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv preprint arXiv:...},
  year={2022}
}

Acknowledgements

Code is based heavily on the following repositories: Swin-Transformer-Semantic-Segmentation, Mask2Former, and FaPN-full.
