CRL_EGPG

PyTorch implementation of Contrastive Representation Learning for Exemplar-Guided Paraphrase Generation

We use the contrastive loss implementation by HobbitLong.

How to train

  1. Download the dataset from here and put it in the project directory.
    You can use the preprocessed datasets directly (data/: QQP-Pos, data2: ParaNMT),
    or process the raw Quora and ParaNMT data yourself with quora_process.py and para_process.py, respectively.
    If you process the data yourself, set the variable text_path in both scripts first.
  2. python train.py --datasets quora --model_save_path directory_to_save_model
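
If you launch training from a script, for example to keep the save directory consistent across runs, the command in step 2 can be assembled programmatically. The following is a minimal sketch, not part of the repository: build_train_command is a hypothetical helper, and only the --datasets value quora is confirmed by this README.

```python
import shlex


def build_train_command(dataset="quora", model_save_path="directory_to_save_model"):
    """Return the argv list for the training command shown in step 2.

    Hypothetical helper: it only assembles the flags documented above.
    """
    return [
        "python", "train.py",
        "--datasets", dataset,
        "--model_save_path", model_save_path,
    ]


cmd = build_train_command(model_save_path="directory_to_save_model")
# Print the command instead of running it. To launch training, pass cmd to
# subprocess.run(cmd, check=True) from the project directory.
print(shlex.join(cmd))
```

Building the command as a list avoids shell-quoting problems when the save path contains spaces.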

How to evaluate

  1. First, generate the test target sentences by running
    python evaluate --model_save_path your_saved_model --idx which_model_you_want_to_test
    This produces the generated target file trg_genidx.txt and the corresponding exemplar file exmidx.txt.
  2. Follow the instructions provided by malllabiisc
    and set up the evaluation code. Then run
    python -m src.evaluation.eval -i path/trg_genidx.txt -r path/test_trg.txt -t path/exmidx.txt
    Replace path with the directory that contains each file.
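
The evaluation command in step 2 takes three file paths, which are easy to mix up. The sketch below assembles it from one output directory and a model index. It is an illustration under assumptions: build_eval_command is a hypothetical helper, it assumes that idx in trg_genidx.txt and exmidx.txt stands for the value passed to --idx, and it assumes all three files sit in the same directory.

```python
from pathlib import PurePosixPath


def build_eval_command(output_dir, idx, reference="test_trg.txt"):
    """Return the argv list for the evaluation command shown in step 2.

    Assumption: the generated and exemplar files are named
    trg_gen<idx>.txt and exm<idx>.txt and live in output_dir.
    """
    out = PurePosixPath(output_dir)
    return [
        "python", "-m", "src.evaluation.eval",
        "-i", str(out / f"trg_gen{idx}.txt"),  # generated target sentences
        "-r", str(out / reference),            # reference target sentences
        "-t", str(out / f"exm{idx}.txt"),      # exemplar sentences
    ]


print(" ".join(build_eval_command("path", 3)))
```

Run the resulting command from the directory where the evaluation code is set up, since src.evaluation.eval is resolved relative to it.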

How to generate multiple paraphrases for one input

You can modify generate.py or just run
python generate.py

GitHub

https://github.com/LHRYANG/CRL_EGPG