Fast Face Classification (F²C)

This is the code for our paper, An Efficient Training Approach for Very Large Scale Face Recognition, or F²C for short.

Training on ultra-large-scale datasets is time-consuming and takes up a lot of hardware resources. We therefore design dual data loaders and a dynamic class pool to handle large-scale face classification.

Preparation

FFC contains an LRU module, so you can either use lru_python_impl.py or
compile the code under the lru_c directory.

If you choose lru_python_impl.py, rename it to lru_utils.py.
The LRU module is not the bottleneck of the training procedure, so feel free to use the Python implementation,
although the C++ implementation is 5 to 10 times faster.
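
For intuition about what an LRU module does, here is a minimal sketch of a least-recently-used cache built on Python's collections.OrderedDict. The class name LRUCache and its get/put methods are hypothetical and chosen only for illustration; they are not the interface of lru_python_impl.py or of the lru_c build, which may differ.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical minimal LRU cache (illustration only, not lru_utils)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be a positive integer")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        """Return the value for key and mark it as most recently used."""
        if key not in self._items:
            return default
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        """Insert or update key; return the evicted (key, value) or None."""
        if key in self._items:
            self._items.move_to_end(key)
            self._items[key] = value
            return None
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Drop the least recently used entry (front of the ordering).
            return self._items.popitem(last=False)
        return None


cache = LRUCache(capacity=2)
cache.put("id_a", 1)
cache.put("id_b", 2)
cache.get("id_a")               # "id_a" becomes the most recently used
evicted = cache.put("id_c", 3)  # evicts "id_b", the least recently used
```

Each operation in this sketch runs in constant time, which is consistent with the note above that the LRU module is not the training bottleneck.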

Compile LRU (optional)

Commands to build the LRU module:

cd lru_c
mkdir build
cd build
cmake ..
make
cd ../../ && ln -s lru_c/build/lru_utils.so .

You can compare the two implementations using lru_c/python/compare_time.py.

Database

Training

In main.py, provide the paths to your training database at lines 152-153:

args.source_lmdb = ['/path to msceleb.lmdb']
args.source_file = ['/path to kv file']

We use LMDB as the format of our training database. Each element of source_file is the path to a text file in which each line is an lmdb_key label pair.
You may refer to LFS
for more details.
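
As an illustration of the kv file described above, the following sketch parses lines of lmdb_key label pairs. It assumes the key and label are separated by whitespace and that the label is an integer; neither detail is confirmed here, so check your own kv files. The function name read_kv_lines is hypothetical and not part of this repository.

```python
def read_kv_lines(lines):
    """Parse 'lmdb_key label' lines into (key, label) tuples.

    Assumption: whitespace-separated fields with an integer label last.
    """
    pairs = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.rsplit(None, 1)  # split on the last run of whitespace
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 'lmdb_key label'")
        key, label = parts
        pairs.append((key, int(label)))
    return pairs


# Made-up sample lines; real keys depend on how the LMDB was built.
sample = ["img_000001 0\n", "\n", "img_000002 0\n", "img_000003 7\n"]
pairs = read_kv_lines(sample)
```

To read a real file, pass an open file object, for example read_kv_lines(open(path)), since a file object iterates line by line.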

Next, modify train_ffc.sh. Before running the training, set the port number and queue_size.
queue_size is a trade-off between performance and speed: a larger queue_size gives higher performance at the cost of training time and GPU resources.
It can be any positive integer. Common settings are 1%, 0.1%, or 0.001% of the total number of identities.
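
To make the percentages concrete, this small sketch converts a fraction of the total identities into a queue_size value. The helper name queue_size_from_fraction and the identity count of 10,000,000 are hypothetical and used only for the arithmetic; substitute the number of identities in your own dataset.

```python
def queue_size_from_fraction(num_identities, fraction):
    """Return a positive integer queue size for a fraction of all identities."""
    if num_identities <= 0:
        raise ValueError("num_identities must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Round to the nearest integer, but never go below 1.
    return max(1, round(num_identities * fraction))


num_identities = 10_000_000  # hypothetical dataset size
sizes = {
    "1%": queue_size_from_fraction(num_identities, 0.01),
    "0.1%": queue_size_from_fraction(num_identities, 0.001),
    "0.001%": queue_size_from_fraction(num_identities, 0.00001),
}
```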

Notice

The difference between r50 and ir50 is that r50 takes 224 × 224 images as input, while ir50 takes 112 × 112 images, as in ArcFace. The ir50 network comes from ArcFace.

Evaluation

We provide the complete test scripts under the evaluation_code directory. Each script requires the path to the image directory and the test pair files.

Tips

The code in evaluation_code/test_megaface.py is much faster than the official version. It is also applicable to extremely large-scale testing.

GitHub

https://github.com/anoymous-face/FFC