Pylomin

Pylomin (PYtorch LOw-Memory INference) is a library for low-memory inference in PyTorch.

Installation

Usage

The following snippet converts every nn.Linear and nn.Embedding module in the model to lazy-loading mode.

model = pylomin.lazy_loading(model, target_instances=(nn.Linear, nn.Embedding))

Alternatively, pass an explicit list of modules via target_modules.

model = pylomin.lazy_loading(model, target_modules=your_target_modules)
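One way to build such a list is to filter the output of PyTorch's model.modules(). The sketch below is only an illustration: the toy model and the min_params threshold are assumptions made for this example and are not part of Pylomin.

```python
import torch.nn as nn

# Toy model used only for illustration; substitute your own model.
model = nn.Sequential(
    nn.Embedding(100, 16),
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)

# Hypothetical threshold (an assumption for this sketch): keep only the
# Linear layers whose weight matrix has at least this many elements.
min_params = 256

your_target_modules = [
    module
    for module in model.modules()  # walks the model and all submodules
    if isinstance(module, nn.Linear) and module.weight.numel() >= min_params
]

# The resulting list can then be passed to Pylomin as in the call above:
# model = pylomin.lazy_loading(model, target_modules=your_target_modules)
```

Selecting modules by hand like this may be useful when only the largest layers are worth converting, while small layers stay resident in memory.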

Detailed documentation is being prepared! 🙂

Methods

1. Lazy-loading

2. Grouped-embedding

3. Prefetching
