PyTorch implementation of the paper "FcaNet: Frequency Channel Attention Networks".
Models pretrained on ImageNet can be loaded directly through torch.hub, with no configuration or installation required:

    import torch

    model = torch.hub.load('cfzd/FcaNet', 'fca34', pretrained=True)
    model = torch.hub.load('cfzd/FcaNet', 'fca50', pretrained=True)
    model = torch.hub.load('cfzd/FcaNet', 'fca101', pretrained=True)
    model = torch.hub.load('cfzd/FcaNet', 'fca152', pretrained=True)

Because the models were trained in FP16 and are provided in FP32, the evaluation results differ slightly from the reported results (at most -0.06%/+0.05%).
To evaluate, run:

    export NGPUS=4
    # -a also accepts the 50, 101, and 152 variants
    python -m torch.distributed.launch --nproc_per_node=$NGPUS main.py \
        -e \
        --b 128 \
        --dali_cpu \
        -a fcanet34 \
        --evaluate_model /path/to/your/downloaded/model \
        /path/to/your/ImageNet

Or please see
Since the paper was uploaded to arXiv, many academic peers have asked us: the proposed DCT basis can be viewed as a simple tensor, so why not learn the tensor directly? Why use the DCT instead of a learnable tensor? Surely a learnable tensor could do better than the DCT.
Our answer is that the proposed DCT basis performs better than the learned alternative, counter-intuitive as that may be:

| Method | ImageNet Top-1 Acc (%) | Link |
|---|---|---|
| Learnable tensor, random initialization | 77.914 | GoogleDrive/BaiduDrive(code:p2hl) |
| Learnable tensor, DCT initialization | 78.352 | GoogleDrive/BaiduDrive(code:txje) |
| Fixed tensor, random initialization | 77.742 | GoogleDrive/BaiduDrive(code:g5t9) |
| Fixed tensor, DCT initialization (Ours) | 78.574 | GoogleDrive/BaiduDrive(code:mgkk) |

To verify these results, select the corresponding type of tensor in L73-L83 of model/layer.py, uncomment it, and train the whole network.
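
To make concrete what "the DCT basis viewed as a simple tensor" means, here is a minimal pure-Python sketch. It assumes the standard orthonormal DCT-II definition; the helper names (`dct_basis_1d`, `dct_basis_2d`, `inner`) and the 7x7 size are illustrative choices for this example, not code taken from model/layer.py. The grid it returns is the kind of tensor that the table above either keeps fixed or replaces with learned or random values.

```python
import math


def dct_basis_1d(pos, freq, size):
    """Orthonormal DCT-II basis value at position `pos` for frequency `freq`."""
    value = math.cos(math.pi * freq * (pos + 0.5) / size) / math.sqrt(size)
    # Non-zero frequencies get a sqrt(2) factor so that every basis has unit norm.
    return value if freq == 0 else value * math.sqrt(2)


def dct_basis_2d(u, v, height, width):
    """Return the height x width 2D DCT basis for the frequency pair (u, v)."""
    return [
        [dct_basis_1d(y, u, height) * dct_basis_1d(x, v, width)
         for x in range(width)]
        for y in range(height)
    ]


def inner(a, b):
    """Element-wise inner product of two equally sized 2D grids."""
    return sum(p * q for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))


if __name__ == "__main__":
    low = dct_basis_2d(0, 0, 7, 7)    # lowest frequency: a constant grid
    mixed = dct_basis_2d(1, 2, 7, 7)  # a higher-frequency component
    print("norm of (0,0) basis:", inner(low, low))
    print("overlap of (0,0) and (1,2):", inner(low, mixed))
```

Under these assumptions each basis has unit norm and distinct frequency pairs are orthogonal, so the first printed value should be close to 1 and the second close to 0. A randomly initialized tensor generally has neither property, which is one way to see how the fixed DCT variant differs from the other rows in the table.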
- [ ] Object detection models
- [ ] Instance segmentation models
- [ ] Make switching between configs easier