We propose a method for converting a single RGB-D input image into a 3D photo, i.e., a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view. We use a Layered Depth Image with explicit pixel connectivity as underlying representation, and present a learning-based inpainting model that iteratively synthesizes new local color-and-depth content into the occluded region in a spatial context-aware manner. The resulting 3D photos can be efficiently rendered with motion parallax using standard graphics engines. We validate the effectiveness of our method on a wide range of challenging everyday scenes and show fewer artifacts when compared with the state-of-the-arts.


  • Linux (tested on Ubuntu 18.04.4 LTS)
  • Anaconda
  • Python 3.7 (tested on 3.7.4)
  • PyTorch 1.4.0 (tested on 1.4.0 for execution)

and the Python dependencies listed in requirements.txt

  • To get started, please run the following commands:
    conda create -n 3DP python=3.7 anaconda
    conda activate 3DP
    pip install -r requirements.txt
    conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
  • Next, please download the model weight using the following command:
    chmod +x download.sh

Quick start

Please follow the instructions in this section.
This should allow to execute our results.
For more detailed instructions, please refer to DOCUMENTATION.md.


  1. Put .jpg files (e.g., test.jpg) into the image folder.
    • E.g., image/moon.jpg
  2. Run the following command
    python demo.py --config argument.yml
    • Note: The 3D photo generation process usually takes about 2-3 minutes depending on the available computing resources.
  3. The results are stored in the following directories:
    • Corresponding depth map estimated by MiDaS
      • E.g. depth/moon.npy
    • Inpainted 3D mesh
      • E.g. mesh/moon.ply
    • Rendered videos with straight-line motion
      • E.g. mesh/moon_straight-line.mp4
    • Rendered videos with swing motion
      • E.g. mesh/moon_swing.mp4
  4. (Optional) If you want to change the default configuration. Please read DOCUMENTATION.md and modified argument.yml.


This work is licensed under MIT License. See LICENSE for details.

If you find our code/models useful, please consider citing our paper:

  author = {Shih, Meng-Li and Su, Shih-Yang and Kopf, Johannes and Huang, Jia-Bin},
  title = {3D Photography using Context-aware Layered Depth Inpainting},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2020}