PyTorch implementation of Constrained Policy Optimization (CPO)

This repository provides a simple, easy-to-understand PyTorch implementation of CPO. A dummy constraint function is included and can be adapted to your needs.
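
In CPO, a constraint is a per-step cost whose expected discounted sum the policy must keep below a threshold. The repository's actual function is not shown here; the following is a minimal sketch of what an adapted constraint could look like, where the name constraint_cost and the action-magnitude penalty are both illustrative assumptions:

import torch

def constraint_cost(state, action):
    """Illustrative per-step constraint cost (hypothetical, not this repo's API).

    CPO bounds the expected discounted sum of these costs below a
    threshold; replace the body with whatever safety signal your
    environment exposes (e.g. distance to an obstacle).
    """
    # Assumed example: cost grows with the squared action magnitude.
    return torch.sum(action ** 2, dim=-1)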

Pre-requisites

  • PyTorch (The code is tested on PyTorch 1.2.0.)
  • OpenAI Gym
  • MuJoCo (mujoco-py)
  • If working with a GPU, set OMP_NUM_THREADS to 1 using:

export OMP_NUM_THREADS=1

Features

  1. TensorBoard integration to track learning.
  2. The best model is tracked and saved using the value (mean) and standard deviation of the average reward (see the sketch after this list).
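
The README does not spell out the exact selection rule; below is a plausible sketch of "best model by value and spread of the average reward", where the mean-minus-standard-deviation score and all names are assumptions rather than the repository's actual code:

import numpy as np
import torch

best_score = float("-inf")

def maybe_save_best(policy, episode_rewards, path="best_model.pt"):
    """Save the policy when a stability-aware reward score improves.

    Scoring by mean minus standard deviation is an assumed criterion:
    it prefers policies that earn high reward consistently.
    """
    global best_score
    rewards = np.asarray(episode_rewards, dtype=np.float64)
    score = rewards.mean() - rewards.std()
    if score > best_score:
        best_score = score
        torch.save(policy.state_dict(), path)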

Usage

  • python algos/main.py --env-name CartPole-v1 --algo-name=CPO --exp-num=1 --exp-name=CPO/CartPole --save-intermediate-model=10 --gpu-index=0 --max-iter=500

Code Reference

Technical Details on CPO

  • main
  • feasible
  • infeasible
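
For background, CPO's per-iteration subproblem either has a feasible solution (a step exists inside the trust region that also satisfies the cost constraint) or is infeasible, in which case the algorithm takes a pure recovery step that only decreases the constraint cost. A sketch of that recovery step follows, based on the CPO paper (Achiam et al., 2017); the variable names and the assumption that H^{-1}b is already available (e.g. via conjugate gradients) are illustrative:

import torch

def recovery_step(Hinv_b, b, delta):
    """Infeasible-case update from the CPO paper (Achiam et al., 2017).

    When no point in the trust region satisfies the cost constraint,
    the parameters move along the direction that most reduces the
    constraint cost: step = -sqrt(2*delta / b^T H^{-1} b) * H^{-1} b,
    where H is the Fisher information matrix, b is the gradient of the
    constraint cost, and delta is the trust-region size.
    """
    return -torch.sqrt(2.0 * delta / torch.dot(b, Hinv_b)) * Hinv_b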
