small-text

Active Learning for Text Classification in Python.

Active Learning allows you to efficiently label training data in a small-data scenario.
This library provides state-of-the-art active learning for text classification and lets you easily mix and match many classifiers and query strategies to build active learning experiments or applications.
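
To make the idea concrete, here is a minimal sketch of a pool-based active learning loop. It uses only scikit-learn and NumPy, not small-text's own API; the function names (`least_confidence_query`, `run_active_learning`) are hypothetical, and synthetic feature vectors stand in for vectorized text. The true labels play the role of the human annotator.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def least_confidence_query(clf, X_pool, n):
    """Return positions of the n pool samples the classifier is least sure about."""
    proba = clf.predict_proba(X_pool)
    confidence = proba.max(axis=1)  # probability of the predicted class
    return np.argsort(confidence, kind="stable")[:n]


def run_active_learning(X, y, n_initial=10, n_queries=5, batch_size=10, seed=0):
    """Illustrative loop: train, query the most uncertain samples, label, repeat."""
    rng = np.random.default_rng(seed)

    # Start with a small labeled set that contains every class.
    per_class = n_initial // len(np.unique(y))
    labeled = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=per_class, replace=False)
        for c in np.unique(y)
    ])
    pool = np.setdiff1d(np.arange(len(X)), labeled)

    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_queries):
        clf.fit(X[labeled], y[labeled])
        picked = pool[least_confidence_query(clf, X[pool], batch_size)]
        # In a real application a human would label `picked` here;
        # this sketch simply reveals the known labels in y.
        labeled = np.concatenate([labeled, picked])
        pool = np.setdiff1d(pool, picked)

    clf.fit(X[labeled], y[labeled])
    return clf, labeled


if __name__ == "__main__":
    X, y = make_classification(n_samples=200, n_features=20, random_state=0)
    clf, labeled = run_active_learning(X, y)
    print(f"labeled {len(labeled)} of {len(X)} samples")
```

Only the queried samples are labeled, which is where the savings in annotation effort come from.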

Features

  • Provides unified interfaces for Active Learning so that you can easily use any classifier provided by sklearn.
  • Optionally, you can also use pytorch classifiers, including transformers models.
  • Re-implementations of multiple strategies from the scientific literature: query strategies and initialization strategies.
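
As a rough illustration of what query strategies and initialization strategies do, the sketch below writes a few of them as plain NumPy functions. These are hypothetical stand-ins, not small-text's classes or signatures: each query strategy takes a matrix of predicted class probabilities and returns the indices of the samples to label next, and the initialization strategy picks the first labeled samples.

```python
import numpy as np


def random_sampling(proba, n, rng):
    """Baseline query strategy: pick n pool samples uniformly at random."""
    return rng.choice(len(proba), size=n, replace=False)


def prediction_entropy(proba, n, rng=None):
    """Pick the n samples whose predicted class distribution has the highest entropy."""
    entropy = -np.sum(proba * np.log(np.clip(proba, 1e-12, 1.0)), axis=1)
    return np.argsort(-entropy, kind="stable")[:n]


def smallest_margin(proba, n, rng=None):
    """Pick the n samples with the smallest gap between the two most likely classes."""
    top_two = np.sort(proba, axis=1)[:, -2:]
    margin = top_two[:, 1] - top_two[:, 0]
    return np.argsort(margin, kind="stable")[:n]


def balanced_random_initialization(y, n_per_class, rng):
    """Initialization strategy: draw the same number of samples from every class."""
    return np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_per_class, replace=False)
        for c in np.unique(y)
    ])


# Because all query strategies share one call signature, they can be swapped freely.
QUERY_STRATEGIES = {
    "random": random_sampling,
    "entropy": prediction_entropy,
    "margin": smallest_margin,
}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    proba = np.array([[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]])
    for name, strategy in QUERY_STRATEGIES.items():
        print(name, strategy(proba, 2, rng))
```

Keeping every strategy behind the same signature is what makes it possible to compare them in an experiment by changing a single name.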

Installation

Small-text can be easily installed via pip:

pip install small-text

For a full installation, include the transformers extra requirement:

pip install small-text[transformers]

Requires Python 3.7 or newer. To use the GPU, CUDA 10.1 or newer is required.
More information about the installation can be found in the documentation.

Quick Start

For a quick start, see the provided examples for binary classification, pytorch multi-class classification, or transformer-based multi-class classification.

Documentation

Read the latest documentation (currently work in progress) here.

GitHub

https://github.com/webis-de/small-text