jiant is an NLP toolkit
The multitask and transfer learning toolkit for natural language processing research.
Why should I use jiant?
- jiant supports multitask learning
- jiant supports transfer learning
- jiant supports 50+ natural language understanding tasks
- jiant supports the following benchmarks:
- jiant is a research library and users are encouraged to extend, change, and contribute to match their needs!
A few additional things you might want to know about jiant:
- jiant is configuration file driven
- jiant is built with PyTorch
- jiant integrates with datasets to manage task data
- jiant integrates with transformers to manage models and tokenizers.
Getting Started
- Get started with some simple Examples
- Learn more about jiant by reading our Guides
- See our list of supported tasks
Installation
To import jiant from source (recommended for researchers):
git clone https://github.com/nyu-mll/jiant.git
cd jiant
pip install -r requirements.txt
# Add the following to your .bashrc or .bash_profile
export PYTHONPATH=/path/to/jiant:$PYTHONPATH
If you plan to contribute to jiant, install additional dependencies with pip install -r requirements-dev.txt.
To install jiant from source (alternative for researchers):
git clone https://github.com/nyu-mll/jiant.git
cd jiant
pip install -e .
To install jiant from pip (recommended if you just want to train/use a model):
pip install jiant
We recommend that you install jiant in a virtual environment or a conda environment.
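If you are unsure whether the current interpreter runs inside such an environment, a small check like the one below can help. It is a minimal sketch using only the Python standard library; the function name `in_isolated_environment` is a hypothetical helper, not part of jiant, and it assumes conda sets the `CONDA_PREFIX` variable when an environment is active.

```python
import os
import sys


def in_isolated_environment() -> bool:
    """Return True if Python appears to run inside a venv or a conda environment."""
    # Inside a venv/virtualenv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base interpreter installation.
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    # Assumption: an activated conda environment exports CONDA_PREFIX.
    in_conda = bool(os.environ.get("CONDA_PREFIX"))
    return in_venv or in_conda


if __name__ == "__main__":
    print("isolated environment:", in_isolated_environment())
```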
To check that jiant was correctly installed, run a simple example.
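Before running a full example, it can be useful to confirm that Python can find the package at all. The snippet below is a rough sketch using only the standard library; `describe_install` is a hypothetical helper, not a jiant function. It treats missing pip metadata as a likely sign of a `PYTHONPATH`-based source install, which is an assumption rather than a guarantee.

```python
import importlib.util
from importlib import metadata


def describe_install(package: str) -> str:
    """Report whether a top-level package is importable and, if known, its version."""
    # find_spec returns None when the top-level package cannot be located.
    if importlib.util.find_spec(package) is None:
        return f"{package}: not importable"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable but no pip metadata, e.g. when found via PYTHONPATH.
        version = "unknown (no pip metadata found)"
    return f"{package}: importable, version {version}"


if __name__ == "__main__":
    print(describe_install("jiant"))
```

A "not importable" result suggests that the `PYTHONPATH` export or the pip install step did not take effect in the current shell or environment.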
Quick Introduction
The following example fine-tunes a RoBERTa model on the MRPC dataset.
Python version:
from jiant.proj.simple import runscript as run
import jiant.scripts.download_data.runscript as downloader
EXP_DIR = "/path/to/exp"
# Download the Data
downloader.download_data(["mrpc"], f"{EXP_DIR}/tasks")
# Set up the arguments for the Simple API
args = run.RunConfiguration(
    run_name="simple",
    exp_dir=EXP_DIR,
    data_dir=f"{EXP_DIR}/tasks",
    hf_pretrained_model_name_or_path="roberta-base",
    tasks="mrpc",
    train_batch_size=16,
    num_train_epochs=3,
)
# Run!
run.run_simple(args)
Bash version:
EXP_DIR=/path/to/exp
python jiant/scripts/download_data/runscript.py \
    download \
    --tasks mrpc \
    --output_path "${EXP_DIR}/tasks"
python jiant/proj/simple/runscript.py \
    run \
    --run_name simple \
    --exp_dir "${EXP_DIR}/" \
    --data_dir "${EXP_DIR}/tasks" \
    --hf_pretrained_model_name_or_path roberta-base \
    --tasks mrpc \
    --train_batch_size 16 \
    --num_train_epochs 3
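When launching several runs that differ only in a few arguments, it may be convenient to assemble the Bash command from Python. The following is a minimal sketch: `build_simple_run_command` is a hypothetical helper, not part of jiant. The script path and flags are copied from the Bash example above, and the snippet only prints the command rather than executing it.

```python
import shlex
from typing import List


def build_simple_run_command(
    exp_dir: str,
    model: str = "roberta-base",
    tasks: str = "mrpc",
    train_batch_size: int = 16,
    num_train_epochs: int = 3,
    run_name: str = "simple",
) -> List[str]:
    """Build the argument list for the simple runscript (hypothetical helper)."""
    return [
        "python", "jiant/proj/simple/runscript.py", "run",
        "--run_name", run_name,
        "--exp_dir", f"{exp_dir}/",
        "--data_dir", f"{exp_dir}/tasks",
        "--hf_pretrained_model_name_or_path", model,
        "--tasks", tasks,
        "--train_batch_size", str(train_batch_size),
        "--num_train_epochs", str(num_train_epochs),
    ]


cmd = build_simple_run_command("/path/to/exp")
# Print a shell-quoted version of the command for inspection.
print(shlex.join(cmd))
```

To actually launch the run, the list could be passed to `subprocess.run(cmd, check=True)` from the root of the cloned repository, after the data has been downloaded.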
Examples of more complex training workflows are found here.