FcaNet

PyTorch implementation of the paper "FcaNet: Frequency Channel Attention Networks".

Simplest usage

Models pretrained on ImageNet can be loaded directly through torch.hub, without any configuration or installation:

import torch

model = torch.hub.load('cfzd/FcaNet', 'fca34', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca50', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca101', pretrained=True)
model = torch.hub.load('cfzd/FcaNet', 'fca152', pretrained=True)

Evaluation

Because training used FP16 and the provided models are stored in FP32, the evaluation results differ slightly (at most -0.06%/+0.05%) from the reported results.

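| Model  | Reported Top-1 (%) | Evaluated Top-1 (%) | Link                               |
|--------|--------------------|---------------------|------------------------------------|
| Fca34  | 75.07              | 75.02               | GoogleDrive/BaiduDrive(code:m7v8)  |
| Fca50  | 78.52              | 78.57               | GoogleDrive/BaiduDrive(code:mgkk)  |
| Fca101 | 79.64              | 79.63               | GoogleDrive/BaiduDrive(code:8t0j)  |
| Fca152 | 80.08              | 80.02               | GoogleDrive/BaiduDrive(code:5yeq)  |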
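
The stated bounds can be checked directly against the table. The snippet below is only a small sanity check: the numbers are copied from the rows above, and the variable names are illustrative rather than anything defined in this repository.

```python
# Reported vs. re-evaluated accuracy (%), copied from the table above.
results = {
    "Fca34": (75.07, 75.02),
    "Fca50": (78.52, 78.57),
    "Fca101": (79.64, 79.63),
    "Fca152": (80.08, 80.02),
}

# Difference = evaluated - reported, rounded to two decimals to hide float noise.
deltas = {
    name: round(evaluated - reported, 2)
    for name, (reported, evaluated) in results.items()
}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")

print("largest drop:", min(deltas.values()))
print("largest gain:", max(deltas.values()))
```

The largest drop and largest gain printed at the end correspond to the -0.06%/+0.05% range quoted above.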

To evaluate, run the following (replace 34 in the -a argument with 50, 101, or 152 for the other models):

export NGPUS=4

python -m torch.distributed.launch --nproc_per_node=$NGPUS main.py \
 -e \
 --b 128 \
 --dali_cpu \
 -a fcanet34 \
 --evaluate_model /path/to/your/downloaded/model \
 /path/to/your/ImageNet

Alternatively, see launch_eval.sh.

FAQ

Since the paper was uploaded to arXiv, many peers have asked us: the proposed DCT basis can be viewed as a simple tensor, so why not learn that tensor directly? Why use the DCT instead of a learnable tensor? Wouldn't a learnable tensor do better than the DCT?

Our answer: counter-intuitive as it may be, the proposed fixed DCT basis performs better than a learned tensor.

| Method                                   | ImageNet Top-1 Acc (%) | Link                               |
|------------------------------------------|------------------------|------------------------------------|
| Learnable tensor, random initialization  | 77.914                 | GoogleDrive/BaiduDrive(code:p2hl)  |
| Learnable tensor, DCT initialization     | 78.352                 | GoogleDrive/BaiduDrive(code:txje)  |
| Fixed tensor, random initialization      | 77.742                 | GoogleDrive/BaiduDrive(code:g5t9)  |
| Fixed tensor, DCT initialization (Ours)  | 78.574                 | GoogleDrive/BaiduDrive(code:mgkk)  |

To verify these results, select the corresponding tensor type at L73-L83 of model/layer.py, uncomment it, and train the whole network.
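
To make "the DCT basis can be viewed as a simple tensor" concrete, here is a minimal pure-Python sketch of building one 2D DCT filter and pooling a feature map with it. It assumes the standard orthonormal DCT-II definition; the function names are hypothetical, and the code in model/layer.py may differ in normalization and in how frequencies are assigned to channels.

```python
import math


def dct_basis_1d(pos, freq, length):
    """Orthonormal DCT-II basis value at position `pos` for frequency `freq`."""
    value = math.cos(math.pi * freq * (pos + 0.5) / length) / math.sqrt(length)
    # Non-zero frequencies get a sqrt(2) factor so every basis vector has unit norm.
    return value if freq == 0 else value * math.sqrt(2)


def dct_filter_2d(u, v, height, width):
    """2D DCT filter for the frequency pair (u, v), as a height x width nested list."""
    return [
        [dct_basis_1d(h, u, height) * dct_basis_1d(w, v, width) for w in range(width)]
        for h in range(height)
    ]


def frequency_pool(feature_map, dct_filter):
    """Weighted sum of one channel's feature map with a filter, giving one scalar."""
    return sum(
        x * f
        for row_x, row_f in zip(feature_map, dct_filter)
        for x, f in zip(row_x, row_f)
    )


# Hypothetical 7x7 single-channel feature map, for illustration only.
H = W = 7
feature_map = [[float(h * W + w) for w in range(W)] for h in range(H)]

lowest = dct_filter_2d(0, 0, H, W)
mean = sum(map(sum, feature_map)) / (H * W)

# The (0, 0) filter is constant, so pooling with it is a scaled global average.
print(frequency_pool(feature_map, lowest), mean * math.sqrt(H * W))
```

In this sketch a "fixed tensor, DCT initialization" corresponds to using filters like these as constants, while the "learnable" rows in the table would treat the same values as trainable parameters. The two printed numbers agree, which shows that global average pooling is the lowest-frequency special case of this kind of pooling.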

Training

Please see launch_training.sh

TODO

  • [ ] Object detection models
  • [ ] Instance segmentation models
  • [ ] Make switching between configs easier
