Finetune ships with a pre-trained language model from "Improving Language Understanding by Generative Pre-Training" and builds on the OpenAI finetune-transformer-lm repository. Huge thanks to Alec Radford for his hard work and quality research.

Finetune Quickstart Guide

Finetuning the base language model is as easy as calling Classifier.fit:

model = Classifier()               # Load base model
model.fit(trainX, trainY)          # Finetune base model on custom data
predictions = model.predict(testX) # [{'class_1': 0.23, 'class_2': 0.54, ..}, ..]
model.save(path)                   # Serialize the model to disk

Reload saved models from disk by using Classifier.load:

model = Classifier.load(path)
predictions = model.predict(testX)
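predict returns one dict of class-label-to-probability mappings per test example, as the comment in the quickstart snippet shows. A minimal sketch of extracting the most likely label from that structure (the probability values below are illustrative, not real model output):

```python
# Predictions in the format returned by model.predict(testX):
# one {class_label: probability} dict per test example.
predictions = [
    {'class_1': 0.23, 'class_2': 0.54, 'class_3': 0.23},
    {'class_1': 0.81, 'class_2': 0.10, 'class_3': 0.09},
]

# Take the highest-probability label for each example
top_labels = [max(p, key=p.get) for p in predictions]
print(top_labels)  # ['class_2', 'class_1']
```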


Finetune can be installed directly from PyPI using pip:

pip3 install finetune

or installed directly from source:

git clone
cd finetune
python3 setup.py develop
python3 -m spacy download en

To run finetune on your host, you'll need a working copy of CUDA >= 8.0, libcudnn >= 6, tensorflow-gpu >= 1.6, and up-to-date nvidia drivers.

You can optionally run the provided test suite to verify that the installation completed successfully.

pip3 install pytest
pytest


If you'd prefer, you can also run finetune in a docker container. The bash scripts provided assume you have a functional install of docker and nvidia-docker.

./docker/      # builds a docker image
./docker/      # starts a docker container in the background
docker exec -it finetune bash # starts a bash session in the docker container

Code Examples

For example usage of the Classifier, Entailment, and SequenceLabeler models, see the finetune/datasets directory. For the sake of simplicity and runtime, these examples use smaller versions of the published datasets.