Transfer Learning for Text Classification with TensorFlow
TensorFlow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432).
An auto-encoder or a language model is pre-trained first, and its weights are then used to initialize the LSTM text classification model.
- SA-LSTM: uses an auto-encoder as the pre-trained model.
- LM-LSTM: uses a language model as the pre-trained model.
Requirements
- Python 3
- pip install -r requirements.txt
The DBpedia dataset is used for both pre-training and training.
Pre-train the auto-encoder or language model
$ python pre_train.py --model="<MODEL>"
(<MODEL>: auto_encoder | language_model)
Train the LSTM text classification model
$ python train.py --pre_trained="<MODEL>"
(<MODEL>: none | auto_encoder | language_model)
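Conceptually, initializing the classifier from a pre-trained model means copying the weights of the layers both models share and leaving the classifier-specific layers at their fresh initialization. The sketch below shows that idea with plain dictionaries. The parameter names, the choice of shared prefixes, and the helper `init_from_pretrained` are all hypothetical; how train.py actually restores weights may differ.

```python
# Illustrative sketch only: parameter names and the helper are hypothetical,
# not taken from train.py.
def init_from_pretrained(classifier_params, pretrained_params,
                         shared_prefixes=("embedding", "lstm")):
    """Return classifier parameters with shared layers taken from the pre-trained model."""
    initialized = dict(classifier_params)
    for name, value in pretrained_params.items():
        # Copy only layers that exist in both models and are marked as shared.
        if name.startswith(shared_prefixes) and name in initialized:
            initialized[name] = value
    return initialized


pretrained = {
    "embedding/W": [[0.1, 0.2]],
    "lstm/kernel": [[0.3]],
    "decoder/W": [[0.9]],   # only needed during pre-training
}
classifier = {
    "embedding/W": [[0.0, 0.0]],
    "lstm/kernel": [[0.0]],
    "output/W": [[0.5]],    # classification layer, trained from scratch
}

params = init_from_pretrained(classifier, pretrained)
print(sorted(params))
```

With `--pre_trained="none"` the equivalent of this step is skipped and every layer starts from its fresh initialization, which gives the plain LSTM baseline.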
- Orange lines: LSTM
- Blue lines: SA-LSTM
- Red lines: LM-LSTM