An implementation of offline RL for recommender systems

@author: misajie
@update: 20220123

File organization:

  • RecEnv
  • ClassicalRL
  • OfflineRL

In progress:

  • Build classical off-policy models and apply them to existing environments (RecSim, Virtual Taobao)
  • Reconstruct a simulator-free model, e.g. FeedRec
  • Modify RecSim to fit the WeChat short video dataset, then run the off-policy models on it and evaluate the results
  • Generate replay samples from the short video recommendation environment
  • Build classical offline models
  • Build an original offline model
  • Evaluate the new model
  • Add AutoML
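
The sample-generation item above could look roughly like the sketch below: roll out a behavior policy in an environment and log (state, action, reward, next state, done) tuples as a fixed dataset for offline training. Everything here is an assumption for illustration. `ToyRecEnv`, its click model, and `collect_replay` are hypothetical names, not part of this repository, RecSim, or Virtual Taobao.

```python
import random
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    state: int        # index of the item the user saw last
    action: int       # index of the item recommended next
    reward: float     # 1.0 if the simulated user clicked, else 0.0
    next_state: int
    done: bool


class ToyRecEnv:
    """Hypothetical stand-in for a recommendation simulator (not RecSim's API)."""

    def __init__(self, n_items: int = 5, horizon: int = 10, seed: int = 0):
        self.n_items = n_items
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.state = 0

    def reset(self) -> int:
        self.t = 0
        self.state = self.rng.randrange(self.n_items)
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Assumed click model: items close to the last one are clicked more often.
        p_click = 1.0 / (1 + abs(action - self.state))
        reward = 1.0 if self.rng.random() < p_click else 0.0
        self.state = action
        self.t += 1
        return self.state, reward, self.t >= self.horizon


def collect_replay(env: ToyRecEnv, n_episodes: int, seed: int = 0) -> List[Transition]:
    """Log transitions from a uniform-random behavior policy."""
    policy_rng = random.Random(seed)
    buffer: List[Transition] = []
    for _ in range(n_episodes):
        state = env.reset()
        done = False
        while not done:
            action = policy_rng.randrange(env.n_items)
            next_state, reward, done = env.step(action)
            buffer.append(Transition(state, action, reward, next_state, done))
            state = next_state
    return buffer


dataset = collect_replay(ToyRecEnv(), n_episodes=3)
```

An offline model would then train on `dataset` alone, without further calls to `env.step`. A uniform-random behavior policy is only one choice; logging from a trained off-policy agent would give a different data distribution.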
