PyTorch implementation of Constrained Policy Optimization (CPO)
This repository provides an easy-to-understand PyTorch implementation of CPO. A dummy constraint function is included and can be adapted to your needs.
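As a rough illustration of what such a constraint function might look like, here is a hedged sketch of a per-state constraint cost; the function name, signature, and penalty are assumptions for illustration, not the repository's actual API:

```python
import torch

# Hypothetical example of a per-state constraint cost, analogous to the
# dummy constraint shipped with the repository (names are assumptions).
def constraint_cost(state: torch.Tensor) -> torch.Tensor:
    """Return a non-negative cost for each state in the batch.

    Here we penalize states whose first coordinate leaves [-1, 1];
    CPO then bounds the expected discounted sum of these costs.
    """
    position = state[..., 0]
    return torch.clamp(position.abs() - 1.0, min=0.0)
```

In CPO, the policy update maximizes reward subject to the expected discounted sum of such costs staying below a chosen limit, so any task-specific safety signal can be plugged in here.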
- PyTorch (the code is tested on PyTorch 1.2.0)
- OpenAI Gym
- MuJoCo (mujoco-py)
- If working with a GPU, set OMP_NUM_THREADS to 1 with `export OMP_NUM_THREADS=1`.
- Tensorboard integration to track learning.
- The best model is tracked and saved based on the value and standard deviation of the average reward.
- python algos/main.py --env-name CartPole-v1 --algo-name=CPO --exp-num=1 --exp-name=CPO/CartPole --save-intermediate-model=10 --gpu-index=0 --max-iter=500
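The best-model criterion described above can be sketched as follows; the class name and the exact trade-off between mean reward and its spread are assumptions for illustration and may differ from the repository's implementation:

```python
# Hypothetical sketch of "best model" tracking: keep the checkpoint whose
# average evaluation reward is high and whose standard deviation is low.
class BestModelTracker:
    def __init__(self):
        self.best_score = float("-inf")

    def update(self, avg_reward: float, std_reward: float) -> bool:
        """Return True when the new stats beat the stored best.

        The score trades off mean reward against its spread; the exact
        weighting used in the repository may differ.
        """
        score = avg_reward - std_reward
        if score > self.best_score:
            self.best_score = score
            return True
        return False
```

When `update` returns True, the caller would save the current policy (e.g. with `torch.save`) as the new best checkpoint.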