PyTorch functions to improve performance, analyse and make your deep learning life easier.

torchfunc is a library revolving around PyTorch that aims to help you with:

  • Improving and analysing performance of your neural network (e.g. Tensor Cores compatibility)
  • Recording and analysing the internal state of torch.nn.Module as data passes through it
  • Doing the above based on external conditions (using a single Callable to specify them)
  • Day-to-day neural network related duties (model size, seeding, performance measurements etc.)
  • Getting information about your host operating system, CUDA devices and others

Quick examples

  • Get instant performance tips about your module. The comments below describe the
    problems that will be reported:
import torch


class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()  # Required before registering submodules
        self.convolution = torch.nn.Sequential(
            torch.nn.Conv2d(1, 32, 3),
            torch.nn.ReLU(inplace=True),  # Inplace may harm kernel fusion
            torch.nn.Conv2d(32, 128, 3, groups=32),  # Depthwise is slower in PyTorch
            torch.nn.ReLU(inplace=True),  # Same as before
            torch.nn.Conv2d(128, 250, 3),  # Wrong output size for TensorCores
        )

        self.classifier = torch.nn.Sequential(
            torch.nn.Linear(250, 64),  # Wrong input size for TensorCores
            torch.nn.ReLU(),  # Fine, no info about this layer
            torch.nn.Linear(64, 10),  # Wrong output size for TensorCores
        )

    def forward(self, inputs):
        pooled = torch.nn.AdaptiveAvgPool2d(1)(self.convolution(inputs))
        # Flatten everything except the batch dimension
        return self.classifier(pooled.flatten(start_dim=1))

# All you have to do
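
The Tensor Cores comments above come down to a divisibility check on layer sizes. Below is a minimal sketch in plain Python, assuming the commonly cited guideline that FP16 Tensor Core kernels prefer sizes that are multiples of 8. `misaligned_dims` is a hypothetical helper, not part of torchfunc, and the exact rule torchfunc applies may differ:

```python
def misaligned_dims(layers, multiple=8):
    """Return (layer name, "in" or "out", size) for every size not divisible by `multiple`."""
    problems = []
    for name, in_size, out_size in layers:
        for kind, size in (("in", in_size), ("out", out_size)):
            if size % multiple != 0:
                problems.append((name, kind, size))
    return problems


# Channel/feature sizes copied by hand from the Model above (assumption:
# names follow the usual "<attribute>.<index>" submodule naming).
layers = [
    ("convolution.0", 1, 32),
    ("convolution.2", 32, 128),
    ("convolution.4", 128, 250),
    ("classifier.0", 250, 64),
    ("classifier.2", 64, 10),
]

for name, kind, size in misaligned_dims(layers):
    print(f"{name}: {kind} size {size} is not a multiple of {8}")
```

Unlike the comments above, this naive check also flags the single input channel of the first convolution. With a real model, the triples could be collected from `model.named_modules()` by reading `in_features`/`out_features` on `torch.nn.Linear` and `in_channels`/`out_channels` on `torch.nn.Conv2d`.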
  • Seed globally (including numpy and cuda), freeze weights, check inference time and model size:
# Inb4 MNIST, you can use any module with those functions
import torch
import torchfunc

model = torch.nn.Linear(784, 10)
frozen = torchfunc.module.freeze(model, bias=False)

with torchfunc.Timer() as timer:
    frozen(torch.randn(32, 784))
    print(timer.checkpoint())  # Time since the beginning
    frozen(torch.randn(128, 784))
    print(timer.checkpoint())  # Since last checkpoint
print(f"Overall time {timer}; Model size: {torchfunc.sizeof(frozen)}")
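
The checkpoint behaviour shown above (time since the start, then time since the previous checkpoint) can be sketched with the standard library alone. This is an illustrative stand-in, not the implementation of torchfunc.Timer; `CheckpointTimer`, its `total` method and its injectable `clock` argument are hypothetical names chosen for this example:

```python
import time


class CheckpointTimer:
    """Context manager measuring total time and time between checkpoints."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # Any zero-argument callable returning seconds
        self._start = None
        self._last = None
        self._end = None

    def __enter__(self):
        self._start = self._last = self._clock()
        self._end = None
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._end = self._clock()
        return False  # Do not swallow exceptions

    def checkpoint(self):
        """Return seconds since the previous checkpoint (or since the start)."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        return elapsed

    def total(self):
        """Return seconds from start to exit (or to now, if still running)."""
        end = self._end if self._end is not None else self._clock()
        return end - self._start


with CheckpointTimer() as timer:
    sum(range(100_000))
    first = timer.checkpoint()   # Time since the beginning
    sum(range(100_000))
    second = timer.checkpoint()  # Time since the first checkpoint
print(f"first={first:.6f}s second={second:.6f}s total={timer.total():.6f}s")
```

Passing the clock in as an argument keeps the sketch easy to test with a fake clock; on a GPU, timings like these are only meaningful if the device is synchronised before each reading.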
  • Record and sum per-layer activation statistics as data passes through the network:
# Still MNIST, but any module can be put in its place
model = torch.nn.Sequential(
    torch.nn.Linear(784, 100),
    torch.nn.Linear(100, 50),
    torch.nn.Linear(50, 10),
)
# Recorder which sums all inputs to layers
recorder = torchfunc.hooks.recorders.ForwardPre(reduction=lambda x, y: x+y)
# Record only for torch.nn.Linear
recorder.children(model, types=(torch.nn.Linear,))
# Train your network normally (or pass data through it)
# Activations of all neurons of the first layer
print(recorder[1])  # You can also post-process this data easily with apply
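
The same idea can be sketched with PyTorch's own forward pre-hooks. The class below is a hypothetical stand-in written for illustration, not the implementation of torchfunc.hooks.recorders.ForwardPre; `InputRecorder` and its methods are assumed names, and only `register_forward_pre_hook` is a real PyTorch call:

```python
import torch


class InputRecorder:
    """Accumulate the inputs of selected child layers with a reduction."""

    def __init__(self, reduction):
        self.reduction = reduction  # Combines the stored value with a new input
        self.data = []              # One slot per registered layer
        self.handles = []           # Hook handles, kept so hooks can be removed

    def children(self, module, types=(torch.nn.Linear,)):
        """Register a forward pre-hook on every direct child of the given types."""
        for child in module.children():
            if isinstance(child, types):
                index = len(self.data)
                self.data.append(None)
                hook = self._make_hook(index)
                self.handles.append(child.register_forward_pre_hook(hook))

    def _make_hook(self, index):
        def hook(module, inputs):
            value = inputs[0].detach()
            if self.data[index] is None:
                self.data[index] = value.clone()
            else:
                self.data[index] = self.reduction(self.data[index], value)
        return hook

    def remove(self):
        """Detach all hooks from the model."""
        for handle in self.handles:
            handle.remove()
        self.handles.clear()

    def __getitem__(self, index):
        return self.data[index]


model = torch.nn.Sequential(
    torch.nn.Linear(784, 100),
    torch.nn.Linear(100, 50),
    torch.nn.Linear(50, 10),
)
recorder = InputRecorder(reduction=lambda x, y: x + y)
recorder.children(model)

with torch.no_grad():
    for _ in range(3):  # Batches must share a shape for x + y to work
        model(torch.randn(32, 784))

print(recorder[1].shape)  # Summed inputs of the second Linear layer
```

Index 1 holds the inputs of the second registered layer, which are the outputs of the first one. Summing whole batches element-wise requires equally sized batches; a reduction over the batch dimension would lift that restriction.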

For more examples (and how to use conditions), see the documentation.



Latest release:

pip install --user torchfunc


Nightly build:

pip install --user torchfunc-nightly


A standalone CPU image and various versions of GPU-enabled images are available
on Docker Hub.

For a CPU quickstart, run:

docker pull szymonmaszke/torchfunc:18.04

Nightly builds are also available; just prefix the tag with nightly_. If you want a GPU image, make sure you have
nvidia/docker installed and its runtime set.