Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost

LOVE was accepted to the ACL 2022 main conference as a long paper.
This is a PyTorch implementation of our paper.

Environment setup

Clone the repository and set up the environment via "requirements.txt". We use Python 3.6.

pip install -r requirements.txt

Data preparation

In our experiments, we use FastText vectors [1] as the target vectors. Download.
After downloading, put the embedding file under the path data/
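As a quick sanity check on the downloaded embeddings, you can load them with a few lines of plain Python. This is a sketch, not the repository's loader: it assumes the standard FastText text format, where the first line holds the vocabulary size and dimension, and each following line holds a word and its space-separated float components. The file name `sample.vec` below is a hypothetical stand-in for the downloaded file.

```python
import io

def load_vec(path, limit=None):
    """Load word vectors from a FastText text-format .vec file.

    The first line is "<vocab_size> <dim>"; each subsequent line is a
    word followed by <dim> space-separated floats.
    """
    vectors = {}
    with io.open(path, "r", encoding="utf-8", errors="ignore") as f:
        n, dim = map(int, f.readline().split())
        for i, line in enumerate(f):
            if limit is not None and i >= limit:
                break
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors, dim

# Hypothetical tiny file in the same format, for demonstration only.
with io.open("sample.vec", "w", encoding="utf-8") as f:
    f.write("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")

vectors, dim = load_vec("sample.vec")
```

The optional `limit` argument lets you load only the first few vectors, which is handy when the full file is several gigabytes.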


First, you can use -help to show the arguments:

python -help

Once the data preparation and environment setup are complete, you can train the model via the command below.
We also provide sample datasets, so you can run the model without downloading anything.

python -dataset data/wiki_100.vec


To reproduce the intrinsic results of our model, you can use the following command;
we provide the trained model used in our paper.
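As an illustration of one common intrinsic check (not the paper's exact evaluation script), you can compare an imputed out-of-vocabulary vector against its FastText target with cosine similarity. Everything below, including the example vectors, is hypothetical:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical imputed vs. target vectors for an OOV word;
# a score near 1.0 means the imputation closely matches the target.
imputed = [0.2, 0.1, 0.7]
target = [0.2, 0.1, 0.7]
score = cosine(imputed, target)
```

Averaging such scores over a held-out vocabulary gives a simple scalar measure of how well the model mimics the target embeddings.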



[1] Bojanowski, Piotr, et al. “Enriching word vectors with subword information.” Transactions of the Association for Computational Linguistics 5 (2017): 135-146.

