Human Performance Capture from Monocular Video in the Wild


Official code release for the 3DV 2021 paper Human Performance Capture from Monocular Video in the Wild. We propose a method that captures dynamic 3D human shape from a monocular video featuring challenging body poses, without any additional input.

If you find our code or paper useful, please cite as

  title={Human Performance Capture from Monocular Video in the Wild},
  author={Guo, Chen and Chen, Xu and Song, Jie and Hilliges, Otmar},
  booktitle={2021 International Conference on 3D Vision (3DV)},

Quick Start

Clone this repo and create the conda environment:

git clone
cd hpcwild
conda env create -f environment.yml
conda activate hpcwild

Additional Dependencies:

  1. Kaolin 0.1.0
  2. MPI mesh library
  3. torch-mesh-isect

Download SMPL models (1.0.0 for Python 2.7 (10 shape PCs)) and move them to the corresponding places:

mkdir lib/smpl/smpl_model/
mv /path/to/smpl/models/basicModel_f_lbs_10_207_0_v1.0.0.pkl smpl_rendering/smpl_model/SMPL_FEMALE.pkl
mv /path/to/smpl/models/basicmodel_m_lbs_10_207_0_v1.0.0.pkl smpl_rendering/smpl_model/SMPL_MALE.pkl
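
The SMPL 1.0.0 files were pickled under Python 2.7, so reading them from Python 3 generally needs an explicit encoding. The sketch below shows one way to do that; `load_smpl_pickle` is a hypothetical helper for illustration and is not part of this repository. The official model files typically also reference `chumpy` objects, so that package may need to be importable when unpickling them.

```python
import pickle
from pathlib import Path


def load_smpl_pickle(path):
    """Load a Python 2 era pickle (such as an SMPL 1.0.0 model) under Python 3.

    Hypothetical helper for illustration; not part of this repository.
    """
    with Path(path).open("rb") as f:
        # encoding="latin1" lets Python 3 decode byte strings written by Python 2.
        return pickle.load(f, encoding="latin1")
```

For example, `load_smpl_pickle("smpl_rendering/smpl_model/SMPL_FEMALE.pkl")` should return the model dictionary if the file was moved as shown above.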

Download checkpoints for external modules:

mv /path/to/checkpoint_iter_370000.pth external/lightweight-human-pose-estimation.pytorch/checkpoint_iter_370000.pth

mv /path/to/ external/pifuhd/checkpoints/

Download IPNet weights:
mv /path/to/IPNet_p5000_01_exp_id01 registration/experiments/IPNet_p5000_01_exp_id01

Download MODNet weights:
gdown --id 1mcr7ALciuAsHCpLnrtG_eop5-EYhbCmz -O modnet_photographic_portrait_matting.ckpt
mv /path/to/modnet_photographic_portrait_matting.ckpt external/MODNet/pretrained/modnet_photographic_portrait_matting.ckpt
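
Since the setup involves several manual moves, a quick check that everything landed where expected can save a failed run later. The following is a minimal sketch, assuming it is run from the repository root; the paths are copied from the `mv` commands above, and `find_missing` is a hypothetical helper, not a script shipped with this repo.

```python
from pathlib import Path

# Destinations taken from the mv commands above, relative to the repository root.
EXPECTED_PATHS = [
    "smpl_rendering/smpl_model/SMPL_FEMALE.pkl",
    "smpl_rendering/smpl_model/SMPL_MALE.pkl",
    "external/lightweight-human-pose-estimation.pytorch/checkpoint_iter_370000.pth",
    "external/pifuhd/checkpoints",
    "registration/experiments/IPNet_p5000_01_exp_id01",
    "external/MODNet/pretrained/modnet_photographic_portrait_matting.ckpt",
]


def find_missing(repo_root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in expected if not (root / p).exists()]
```

Calling `find_missing(".")` returns an empty list when every file and directory is in place, and otherwise lists what still needs to be downloaded or moved.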

Test on 3DPW dataset

Download 3DPW dataset

  1. modify the dataset_path in test.conf.
  2. run bash to obtain the rigid body shape.
  3. run bash to register a SMPL+D model to the rigid human body.
  4. run bash to capture the human performance temporally.
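
Before pointing dataset_path at the download, it can help to confirm that the dataset unpacked as expected. This is a minimal sketch, assuming the standard 3DPW layout in which per-sequence annotations live under `sequenceFiles/<split>/<sequence>.pkl`; `list_3dpw_sequences` is a hypothetical helper, and whether this repo expects dataset_path to be that root directory is also an assumption.

```python
from pathlib import Path


def list_3dpw_sequences(dataset_path):
    """Map each split name to the sequence names found under sequenceFiles/.

    Hypothetical helper; assumes the standard 3DPW directory layout.
    """
    seq_root = Path(dataset_path) / "sequenceFiles"
    sequences = {}
    for pkl in sorted(seq_root.glob("*/*.pkl")):
        # The parent directory is the split; the file stem is the sequence name.
        sequences.setdefault(pkl.parent.name, []).append(pkl.stem)
    return sequences
```

An empty result suggests that the path is wrong or that the sequence files have not been extracted yet.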

Test on your own video

  1. run OpenPose to obtain the 2D keypoints.
  2. run LGD to acquire the initial 3D poses.
  3. run MODNet to extract silhouettes.
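
To inspect the 2D keypoints from the first step, the per-frame JSON files can be parsed directly. This is a minimal sketch, assuming OpenPose's standard `--write_json` output, where each detected person has a flat `pose_keypoints_2d` list of x, y, confidence triplets; `read_openpose_keypoints` is a hypothetical helper, and the exact format the later steps of this repo consume is not specified here.

```python
import json


def read_openpose_keypoints(json_path, person_index=0):
    """Return [(x, y, confidence), ...] for one person in an OpenPose JSON file.

    Hypothetical helper; assumes OpenPose's standard --write_json format.
    """
    with open(json_path) as f:
        frame = json.load(f)
    people = frame.get("people", [])
    if person_index >= len(people):
        return []  # no such detection in this frame
    flat = people[person_index]["pose_keypoints_2d"]
    # Keypoints are stored flat: x0, y0, c0, x1, y1, c1, ...
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
```

Frames where the function returns an empty list have no detection for the requested person and may need to be handled before running the remaining steps.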


We use the code in PIFuHD for the rigid body construction and adapt IPNet for human model registration. We use the off-the-shelf methods OpenPose and MODNet to extract 2D keypoints and silhouettes. We sincerely thank these authors for their awesome work.