Fast Face Classification (F²C)
This is the code for our paper, An Efficient Training Approach for Very Large Scale Face Recognition, or F²C for short.
Training on ultra-large-scale datasets is time-consuming and takes up a lot of hardware resources. We therefore design dual data loaders and a dynamic class pool to handle large-scale face classification.
FFC contains an LRU module. You can either use lru_python_impl.py or compile the C++ code under the lru_c directory.
If you choose lru_python_impl.py, rename it to lru_utils.py. LRU is not the bottleneck of the training procedure, so feel free to use the Python implementation, although the C++ implementation is 5 to 10 times faster.
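For readers unfamiliar with the idea, the following is a minimal sketch of a least-recently-used (LRU) structure. It is not the code in lru_python_impl.py or lru_c, and the class name `LRUSketch` and its `get`/`put` methods are hypothetical names chosen for illustration; the real lru_utils interface may differ.

```python
from collections import OrderedDict


class LRUSketch:
    """Illustrative LRU cache; NOT the repository's lru_utils API."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be a positive integer")
        self.capacity = capacity
        # Keys are kept in access order: least recently used first.
        self._slots = OrderedDict()

    def get(self, key):
        """Return the stored value and mark the key as recently used."""
        if key not in self._slots:
            return None
        self._slots.move_to_end(key)
        return self._slots[key]

    def put(self, key, value):
        """Store a value; return the evicted key, or None if nothing was evicted."""
        if key in self._slots:
            self._slots.move_to_end(key)
            self._slots[key] = value
            return None
        evicted = None
        if len(self._slots) >= self.capacity:
            # Drop the least recently used entry to make room.
            evicted, _ = self._slots.popitem(last=False)
        self._slots[key] = value
        return evicted
```

With a capacity of 2, inserting keys 1 and 2, reading key 1, and then inserting key 3 evicts key 2, because key 1 was touched more recently.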
Compile LRU (optional)
Commands to build LRU:
    cd lru_c
    mkdir build
    cd build
    cmake ..
    make
    cd ../../ && ln -s lru_c/build/lru_utils.so .
You can compare the two implementations using lru_c/python/compare_time.py.
In main.py, provide the paths to your training database on lines 152-153:
    args.source_lmdb = ['/path to msceleb.lmdb']
    args.source_file = ['/path to kv file']
We use LMDB as the format of our training database. Each element of
source_file is the path to a text file in which each line holds one
lmdb_key label pair. You may refer to LFS for more details.
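As a rough illustration of the kv file described above, the sketch below reads such a file into a list of (lmdb_key, label) tuples. It assumes the key and label are separated by whitespace and that the label is an integer; neither detail is stated here, so check your own file first. The function name `read_kv_file` is hypothetical and is not part of this repository.

```python
def read_kv_file(path):
    """Parse a text file of 'lmdb_key label' lines into (key, label) tuples.

    Assumption: whitespace-separated fields with an integer label last.
    """
    pairs = []
    with open(path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split on the last run of whitespace so the label is the final field.
            parts = line.rsplit(None, 1)
            if len(parts) != 2:
                raise ValueError(f"{path}:{line_no}: expected 'lmdb_key label'")
            key, label = parts
            pairs.append((key, int(label)))
    return pairs
```

Running this over a kv file before training is a cheap way to catch malformed lines and to count the number of distinct identities, for example with `len({label for _, label in pairs})`.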
Now you can modify train_ffc.sh. Before running the training, you should set the port number and queue_size.
queue_size controls the trade-off between performance and speed. A larger
queue_size gives higher performance at the cost of training time and GPU resources. It can be any positive integer. Common settings are 1%, 0.1%, or 0.001% of the total number of identities.
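To turn one of these percentages into a concrete value, a small helper like the one below can be used. This is only a sketch: the function name `queue_size_from_fraction` is hypothetical, and the identity count in the usage note is a made-up number, not a property of any particular dataset.

```python
def queue_size_from_fraction(num_identities, fraction):
    """Return a positive integer queue_size as a fraction of the identity count."""
    if num_identities <= 0:
        raise ValueError("num_identities must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # queue_size must be a positive integer, so never return less than 1.
    return max(1, round(num_identities * fraction))
```

For a hypothetical dataset with 1,000,000 identities, `queue_size_from_fraction(1_000_000, 0.001)` corresponds to the 0.1% setting. The result is the value to set for queue_size in train_ffc.sh.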
We provide the full set of test scripts under the evaluation_code directory. Each script requires the path to the image directory and the test pair files.
evaluation_code/test_megaface.py is much faster than the official version. It is also applicable to extremely large-scale testing.