Probabilistic Gradient Boosting Machines (PGBM) is a probabilistic gradient boosting framework in Python based on PyTorch, developed by Airlab in Amsterdam. It provides the following advantages over existing frameworks:
- Probabilistic regression estimates instead of only point estimates.
- Auto-differentiation of custom loss functions.
- Native GPU-acceleration.
It is aimed at users interested in solving large-scale tabular probabilistic regression problems, such as probabilistic time series forecasting. For more details, read our paper or check out the examples.
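To illustrate the difference between a point estimate and a probabilistic estimate, here is a minimal sketch using only the Python standard library. It does not use the PGBM API: the forecast mean and standard deviation are made-up numbers, and the Normal distribution is an assumption chosen for the illustration.

```python
from statistics import NormalDist


def prediction_interval(mean, std, coverage=0.9):
    """Return (lower, upper) bounds of a central interval, assuming a Normal distribution."""
    tail = (1.0 - coverage) / 2.0
    dist = NormalDist(mu=mean, sigma=std)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Hypothetical forecast: a point model would return only the mean (100.0),
# while a probabilistic model also quantifies the uncertainty around it.
mean, std = 100.0, 10.0
lower, upper = prediction_interval(mean, std, coverage=0.9)
print(f"point estimate: {mean:.1f}, 90% interval: [{lower:.1f}, {upper:.1f}]")
```

In a forecasting setting, an interval like this can be used to plan for a range of outcomes rather than a single expected value.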
- Run `pip install pgbm` from a terminal within the virtual environment of your choice.
- Download and run an example from the examples folder to verify that the installation is correct. Run it once with the GPU and once with the CPU as device to check that you can train on both.
- Note that when training on the GPU, the custom CUDA kernel will be JIT-compiled when initializing a model. Hence, the first time you train a model on the GPU it can take a bit longer, as PGBM needs to compile the CUDA kernel.
- When using the Numba backend, several functions need to be JIT-compiled. Hence, the first time you train a model using this backend it can take a bit longer.
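Before running an example on the GPU, it can help to check whether PyTorch sees a CUDA device at all. The following is a small sketch, not part of PGBM: `pick_device` is a hypothetical helper, and the returned strings `"gpu"` and `"cpu"` are assumed device names, so check the examples for the exact values PGBM expects. `torch.cuda.is_available()` is the standard PyTorch call.

```python
import importlib.util


def pick_device():
    """Return "gpu" if PyTorch is installed and can see a CUDA device, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this environment
    import torch

    return "gpu" if torch.cuda.is_available() else "cpu"


# The device names below are assumptions for illustration.
device = pick_device()
print(f"Run the example with device={device!r}")
```

If this reports `"cpu"` on a machine with an NVIDIA GPU, the PyTorch build or the CUDA installation is the first thing to check.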
The core package has the following dependencies:
- PyTorch >= 1.7.0, with CUDA 11.0 for GPU acceleration (https://pytorch.org/get-started/locally/)
- Numpy >= 1.19.2 (install via
- CUDA Toolkit 11.0 (or one matching your PyTorch distribution) (https://developer.nvidia.com/cuda-toolkit)
- PGBM uses a custom CUDA kernel that needs to be compiled, which may require installing a suitable compiler. Installing PyTorch and the full CUDA Toolkit should be sufficient; contact the author if it still does not work after installing these dependencies.
- To run the experiments comparing against baseline models a number of additional packages may need to be installed via
We also provide a version of PGBM based on a Numba backend for users who do not want to use PyTorch. In that case, Numba must be installed. The Numba backend does not support differentiable loss functions. For an example of using PGBM with the Numba backend, see the examples.
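To check the dependencies above in the current environment, a sketch along these lines may help. It uses only the standard library; `REQUIREMENTS`, `parse_version` and `check` are hypothetical names, the minimum versions are copied from the dependency list, and no minimum is assumed for Numba because none is stated. The version parsing is deliberately simple and ignores pre-release tags.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions from the dependency list; None means no stated minimum.
REQUIREMENTS = {"torch": (1, 7, 0), "numpy": (1, 19, 2), "numba": None}


def parse_version(text):
    """Turn '1.19.2' or '2.1.0+cu118' into a 3-tuple of integers."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    parts += [0] * (3 - len(parts))  # pad '1.7' to (1, 7, 0)
    return tuple(parts)


def check(package, minimum):
    """Report whether a package is missing, too old, or ok."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return "missing"
    if minimum is not None and parse_version(installed) < minimum:
        return f"too old ({installed})"
    return f"ok ({installed})"


for package, minimum in REQUIREMENTS.items():
    print(f"{package}: {check(package, minimum)}")
```

Only one of `torch` and `numba` needs to be present, depending on which backend you intend to use.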