BitPack is a practical tool for efficiently saving quantized neural network models with mixed bitwidths.
- PyTorch version >= 1.4.0
- Python version >= 3.5
- To install BitPack, simply run:
```shell
git clone https://github.com/Zhen-Dong/BitPack.git
cd BitPack
```
- Use BitPack's pack.py to save integer checkpoints with various bitwidths, and unpack.py to load the packed checkpoint, as shown in the demo.
- To pack integer values that are saved in floating-point format, add --force-pack-fp to the command.
- To directly save a packed checkpoint in PyTorch, use save_quantized_state_dict() and load_quantized_state_dict() in pytorch_interface.py. If you prefer not to operate on the whole state_dict at once, the code inside the for loop of those two functions can be applied to each quantized tensor (an ultra-low-precision integer tensor) in various quantization frameworks.
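BitPack's actual storage layout lives inside pack.py; as a rough illustration of the general idea of bit-packing, here is a minimal NumPy sketch (the names `pack_4bit` and `unpack_4bit` are hypothetical, not BitPack's API) that stores two 4-bit integers per byte:

```python
import numpy as np

def pack_4bit(values: np.ndarray) -> np.ndarray:
    """Pack an array of 4-bit integers (0..15) two per byte."""
    assert values.min() >= 0 and values.max() < 16
    if len(values) % 2:
        # Pad to an even length so every byte holds two values.
        values = np.append(values, np.uint8(0))
    return ((values[0::2] << 4) | values[1::2]).astype(np.uint8)

def unpack_4bit(packed: np.ndarray, length: int) -> np.ndarray:
    """Recover the original 4-bit integers from the packed bytes."""
    high = packed >> 4
    low = packed & 0x0F
    # Interleave the nibbles and trim any padding.
    return np.stack([high, low], axis=1).reshape(-1)[:length]

x = np.array([3, 7, 15, 0, 9], dtype=np.uint8)
packed = pack_4bit(x)          # 5 values fit in 3 bytes
restored = unpack_4bit(packed, len(x))
assert np.array_equal(restored, x)
```

A 4-bit tensor packed this way takes half the space of a uint8 tensor; BitPack generalizes this to arbitrary mixed bitwidths.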
BitPack works with various quantization frameworks. Here we show a demo of applying BitPack to save a mixed-precision model generated by HAWQ.
```shell
export CUDA_VISIBLE_DEVICES=0
python pack.py --input-int-file quantized_checkpoint.pth.tar --force-pack-fp
python unpack.py --input-packed-file packed_quantized_checkpoint.pth.tar --original-int-file quantized_checkpoint.pth.tar
```
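The --force-pack-fp flag is needed here because some quantization frameworks export integer-valued weights in float32 tensors. Conceptually, the conversion back to an integer dtype before packing amounts to something like this sketch (variable names are illustrative, not BitPack's code):

```python
import numpy as np

# Quantized weights sometimes leave a framework as float32
# even though every value is a small integer.
w_fp = np.array([1.0, -3.0, 7.0, 0.0], dtype=np.float32)

# Round, then cast to an integer dtype; rint() guards against
# tiny floating-point error in the stored values.
w_int = np.rint(w_fp).astype(np.int8)

# Safe only if the values really are integers.
assert np.allclose(w_fp, w_int)
```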
To get a better sense of how BitPack works, we provide a simple test that compares the original tensor, the packed tensor, and the unpacked tensor in detail.
```shell
cd bitpack
python bitpack_utils.py
```
Results of BitPack on ResNet50
| Original Precision | Quantization | Original Size (MB) | Packed Size (MB) | Compression Ratio |
|---|---|---|---|---|
- unpack.py can also be used to check correctness: it loads and unpacks the packed model, then compares it with the original model.
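That correctness check boils down to a key-by-key tensor comparison between the two checkpoints. A minimal sketch of such a comparison (`state_dicts_equal` is a hypothetical helper, not part of BitPack):

```python
import torch

def state_dicts_equal(a: dict, b: dict) -> bool:
    """Check that two checkpoints hold identical tensors, key by key."""
    if a.keys() != b.keys():
        return False
    return all(torch.equal(a[k], b[k]) for k in a)

original = {"w": torch.tensor([1, -2, 3], dtype=torch.int8)}
roundtrip = {"w": torch.tensor([1, -2, 3], dtype=torch.int8)}
assert state_dicts_equal(original, roundtrip)
```

Because packing and unpacking are lossless, the unpacked state_dict should match the original exactly, not just approximately.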