Bottleneck Transformers for Visual Recognition

Experiments

Model                    Params (M)  Acc (%)
ResNet50 baseline (ref)  23.5        93.62
BoTNet-50                18.8        95.11
BoTNet-S1-50             18.8        95.67
BoTNet-S1-59             27.5        95.98
BoTNet-S1-77             44.9        WIP
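
As a rough check on the table, the gap of about 4.7M parameters between the ResNet50 baseline and BoTNet-50 can be approximated by hand. The sketch below assumes that the 3x3 convolutions in the last three bottleneck blocks (512 channels in and out) are each replaced by self-attention with three 1x1 projections for queries, keys, and values. It ignores biases, normalization layers, and position embeddings. The helper names `conv_params` and `mhsa_params` are illustrative and not part of this repository.

```python
# Rough parameter arithmetic under the assumptions stated above.
# Biases, batch norm, and position embeddings are deliberately ignored.


def conv_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Weight count of a bias-free 2D convolution with a square kernel."""
    return in_ch * out_ch * kernel * kernel


def mhsa_params(channels: int) -> int:
    """Weight count of the query, key, and value 1x1 projections."""
    return 3 * conv_params(channels, channels, 1)


CHANNELS = 512  # assumed width of the middle layer in the last-stage bottlenecks
NUM_BLOCKS = 3  # assumed number of blocks whose 3x3 conv is replaced

per_block = conv_params(CHANNELS, CHANNELS, 3) - mhsa_params(CHANNELS)
saved = NUM_BLOCKS * per_block
print(f"{saved / 1e6:.2f}M fewer weights")  # 4.72M fewer weights
```

Under these assumptions the estimate lands close to the 23.5M versus 18.8M difference reported above. The small remainder would come from the terms the sketch leaves out.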

Summary

[Image: screenshot, 2021-01-28 4:50:19 PM]

Usage (example)

  • Model
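import torch
from model import ResNet50

# Build the model for 1000 classes at a 224x224 input resolution.
model = ResNet50(num_classes=1000, resolution=(224, 224))
x = torch.randn([2, 3, 224, 224])  # dummy batch of two RGB images
print(model(x).size())  # size of the class-score output for the batch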
  • Module
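from model import MHSA

planes = 512     # example channel count (an assumption); match your feature map's channels
resolution = 14  # height and width of the feature map the module attends over
mhsa = MHSA(planes, width=resolution, height=resolution)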
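
The value `resolution = 14` matches a 224x224 input if the backbone downsamples by a total stride of 16 before the attention layer, as a standard ResNet does ahead of its last stage. A minimal sketch of that arithmetic follows. `feature_map_size` is a hypothetical helper, and the stride of 16 is an assumption about the backbone rather than something read from this repository.

```python
def feature_map_size(input_size: int, total_stride: int = 16) -> int:
    """Side length of the feature map after downsampling by total_stride."""
    if input_size % total_stride != 0:
        raise ValueError("input_size should be divisible by total_stride")
    return input_size // total_stride


resolution = feature_map_size(224)
tokens = resolution * resolution  # number of spatial positions attended over
print(resolution, tokens)  # 14 196
```

Self-attention compares every position with every other, so its cost grows with the square of `tokens`. This is one reason to apply it only at low-resolution feature maps.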

Reference

  • Paper link
  • Authors: Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
  • Organization: UC Berkeley, Google Research

GitHub

https://github.com/leaderj1001/BottleneckTransformers