Vision Transformer with Deformable Attention (Jittor)

This repository contains a simple Jittor implementation of Vision Transformer with Deformable Attention (arXiv:2201.00520).

Currently, we release only the model code; the training scripts, including advanced data augmentations and mixed-precision training, are still under development.

A PyTorch version of this code is also available on GitHub.
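
Since only the model code is released, the snippet below is a minimal inference sketch. The import path models.dat, the class name DAT, and constructing it with default arguments are assumptions based on a typical layout for this kind of repository; check the model files for the exact interface.

    import jittor as jt
    from models.dat import DAT        # import path and class name are assumptions

    jt.flags.use_cuda = jt.has_cuda   # run on the GPU when CUDA is available

    model = DAT()                     # assumed default ImageNet-1k configuration
    model.eval()

    x = jt.randn(1, 3, 224, 224)      # dummy batch of one 224x224 RGB image
    out = model(x)                    # the forward pass may also return auxiliary
                                      # outputs (e.g. sampling positions); see the
                                      # model code for the exact return signature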

Dependencies

  • NVIDIA GPU + CUDA 11.1 + cuDNN 8.0.3
  • Python 3.7 (Recommend to use Anaconda)
  • jittor == 1.3.1.40
  • jimm
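
After installing the packages above (jittor and jimm are assumed to be installed from PyPI; jimm provides Jittor image-model utilities), a quick check such as the following can confirm the environment. The expected version string is taken from the pinned dependency above.

    import jittor as jt
    import jimm                        # Jittor image models used by this repository

    print(jt.__version__)              # expected: 1.3.1.40
    jt.flags.use_cuda = jt.has_cuda    # enable CUDA when a GPU is visible
    print(jt.randn(2, 3).sum())        # a tiny op to verify the backend works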

TODO

  • Training scripts with advanced data augmentations.

Citation

If you find our work useful in your research, please consider citing:

@misc{xia2022vision,
      title={Vision Transformer with Deformable Attention}, 
      author={Zhuofan Xia and Xuran Pan and Shiji Song and Li Erran Li and Gao Huang},
      year={2022},
      eprint={2201.00520},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

[email protected]
