
Hawkeye

Hawkeye is a unified, deep-learning-based fine-grained image recognition toolbox built on PyTorch, designed for researchers and engineers. Currently, Hawkeye contains representative fine-grained recognition methods from different paradigms, including utilizing deep filters, leveraging attention mechanisms, performing high-order feature interactions, designing specific loss functions, recognizing with web data, and miscellaneous others.

Updates

Nov 01, 2022: Our Hawkeye is launched!

Model Zoo

The supported methods are placed in model/methods, and the corresponding losses are placed in model/loss.

Get Started

We provide a brief tutorial for Hawkeye.

Clone

git clone https://github.com/Hawkeye-FineGrained/Hawkeye.git
cd Hawkeye

Requirements

  • Python 3.8
  • PyTorch 1.11.0 or higher
  • torchvision 0.12.0 or higher
  • numpy
  • yacs
  • tqdm
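
If you want to confirm that your environment satisfies these requirements, a minimal check such as the one below (plain Python, not part of Hawkeye) prints the installed versions:

from importlib.metadata import version  # available since Python 3.8

# Print the installed versions of the packages Hawkeye relies on.
for pkg in ["torch", "torchvision", "numpy", "yacs", "tqdm"]:
    print(f"{pkg}: {version(pkg)}")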

Preparing Datasets

Eight representative fine-grained recognition benchmark datasets are provided as follows.

| Dataset name   | Year | Meta-class       | # images | # categories | Download link |
|----------------|------|------------------|----------|--------------|---------------|
| CUB-200        | 2011 | Birds            | 11,788   | 200          | https://data.caltech.edu/records/65de6-vp158/files/CUB_200_2011.tgz |
| Stanford Dog   | 2011 | Dogs             | 20,580   | 120          | http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar |
| Stanford Car   | 2013 | Cars             | 16,185   | 196          | http://ai.stanford.edu/~jkrause/car196/car_ims.tgz |
| FGVC Aircraft  | 2013 | Aircrafts        | 10,000   | 100          | https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/archives/fgvc-aircraft-2013b.tar.gz |
| iNat2018       | 2018 | Plants & Animals | 61,333   | 8,142        | https://ml-inat-competition-datasets.s3.amazonaws.com/2018/train_val2018.tar.gz |
| WebFG-bird     | 2021 | Birds            | 18,388   | 200          | https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-bird.tar.gz |
| WebFG-car      | 2021 | Cars             | 21,448   | 196          | https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-car.tar.gz |
| WebFG-aircraft | 2021 | Aircrafts        | 13,503   | 100          | https://web-fgvc-496-5089-sh.oss-cn-shanghai.aliyuncs.com/web-aircraft.tar.gz |

Downloading Datasets

You can download a dataset to the data/ directory by running the following commands. Here we take CUB-200 as an example.

cd Hawkeye/data
wget https://data.caltech.edu/records/65de6-vp158/files/CUB_200_2011.tgz
mkdir bird && tar -xvf CUB_200_2011.tgz -C bird/

We provide the meta-data files of the datasets in metadata/; the train list and the val list are provided according to the official splits of each dataset. There is no need to modify the decompressed directory of the dataset. The following is an example of the directory structure for two datasets.

data
├── bird
│   ├── CUB_200_2011
│   │   ├── images
│   │   │   ├── 001.Black_footed_Albatross
│   │   │   │   ├── Black_Footed_Albatross_0001_796111.jpg
│   │   │   │   └── ··· 
│   │   │   └── ···
│   │   └── ···
│   └── ···
├── web-car
│   ├── train
│   │   ├── Acura Integra Type R 2001
│   │   │   ├── Acura Integra Type R 2001_00001.jpg
│   │   │   └── ···
│   ├── val
│   │   ├── Acura Integra Type R 2001
│   │   │   ├── 000450.jpg
│   │   │   └── ···
│   │   └── ···
│   └── ···
└── ···
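
If you want to see how the train and val lists are organized, you can print the first few lines of one of them. The file name train.txt below is an assumption, so check metadata/cub/ for the actual file names.

# Peek at the first few entries of a meta-data list.
# NOTE: "metadata/cub/train.txt" is an assumed file name; check metadata/cub/ for the real one.
with open("metadata/cub/train.txt") as f:
    for _, line in zip(range(5), f):
        print(line.rstrip())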

Configuring Datasets

When using different datasets, you need to modify the dataset paths in the corresponding config file: meta_dir is the path to the meta-data directory, which contains the train list and the val list, and root_dir is the path to the image folder under data/. Here are two examples.

Note that the relative paths in the meta-data lists must be valid relative to root_dir (a quick way to check this is sketched after the examples below).

dataset:
  name: cub
  root_dir: data/bird/CUB_200_2011/images
  meta_dir: metadata/cub

dataset:
  name: web_car
  root_dir: data/web-car
  meta_dir: metadata/web_car
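
Since the relative paths in the meta-data lists must resolve under root_dir, a small sanity check like the sketch below can catch path mismatches early. The file name train.txt and the assumption that each line starts with a relative image path are not taken from Hawkeye itself, so check the files under metadata/ for your dataset first.

import os

# Check that every relative path in a meta-data list resolves under root_dir.
# NOTE: the file name "train.txt" and the "relative_path [label]" line format are
# assumptions; check the actual files under metadata/ for your dataset.
root_dir = "data/bird/CUB_200_2011/images"
meta_list = "metadata/cub/train.txt"

missing = []
with open(meta_list) as f:
    for line in f:
        if not line.strip():
            continue                    # skip blank lines
        rel_path = line.split()[0]      # first token: relative image path (assumed)
        if not os.path.isfile(os.path.join(root_dir, rel_path)):
            missing.append(rel_path)

print(f"{len(missing)} entries in {meta_list} were not found under {root_dir}")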

Note that ProtoTree is trained on an offline-augmented dataset; refer to the link if needed. We only provide the meta-data for the offline-augmented CUB-200 in metadata/cub_aug.

Training

For each method in the repo, we provide separate training example files in the Examples/ directory.

  • For example, the command to train an APINet:

    python Examples/APINet.py --config configs/APINet.yaml

    The default parameters of the experiment are shown in configs/APINet.yaml.
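
    If you want to inspect or tweak these defaults before training, the config can be loaded with yacs, which is already listed in the requirements. This is a generic yacs sketch, not Hawkeye's own config loader:

    from yacs.config import CfgNode

    # Load and print the default experiment parameters from the config file.
    with open("configs/APINet.yaml") as f:
        cfg = CfgNode.load_cfg(f)
    print(cfg)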

Some methods require multi-stage training.

  • For example, training BCNN requires two stages of training, cf. its two config files.

    First, run the first stage of model training:

    python Examples/BCNN.py --config configs/BCNN_S1.yaml

    Then, run the second stage of training. You need to modify the model weight path (the load entry in BCNN_S2.yaml) so that it loads the parameters obtained from the first stage of training, such as results/bcnn/bcnn_cub s1/best_model.pth (an optional check of this checkpoint is sketched after the command below).

    python Examples/BCNN.py --config configs/BCNN_S2.yaml
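
    If the second stage cannot find or load the checkpoint, a quick way to verify that the stage-one weights load cleanly is the small check below (the path is taken from the example above; adjust it to your own experiment directory):

    import torch

    # Verify that the stage-one BCNN checkpoint can be loaded before starting stage two.
    ckpt_path = "results/bcnn/bcnn_cub s1/best_model.pth"  # path from the example above
    state = torch.load(ckpt_path, map_location="cpu")
    print(type(state))  # typically a dict of parameter tensors (a state_dict)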

In addition, specific parameters of each method are also commented in their configs.

License

This project is released under the MIT license.

Contacts

If you have any questions about our work, please do not hesitate to contact us by email.

Xiu-Shen Wei: [email protected]

Jiabei He: [email protected]

Yang Shen: [email protected]

Acknowledgements

This project is supported by National Key R&D Program of China (2021YFA1001100), National Natural Science Foundation of China under Grant (62272231), Natural Science Foundation of Jiangsu Province of China under Grant (BK20210340), and the Fundamental Research Funds for the Central Universities (No. 30920041111, No. NJ2022028).
