
SynSin: End-to-end View Synthesis from a Single Image



This is the code for the CVPR 2020 paper. Given a single image of a scene unseen at test time, the model synthesises new views of that scene. It is trained end to end on pairs of views in a self-supervised fashion, using GAN techniques and a new differentiable point cloud renderer.
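The geometric step underlying the point cloud renderer can be sketched as follows. This is an illustrative NumPy sketch of the backproject-transform-reproject warp only, with made-up function names (`backproject`, `reproject`) and a toy camera; it is not the paper's renderer, which performs differentiable soft splatting of learned features via pytorch3d.

```python
import numpy as np

def backproject(depth, K):
    """Lift each pixel of a depth map to a 3D point in camera coordinates."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x HW homogeneous pixels
    rays = np.linalg.inv(K) @ pix          # unit-depth viewing rays
    return rays * depth.reshape(1, -1)     # 3 x HW points, scaled by depth

def reproject(points, K, R, t):
    """Rigidly transform points into a target camera and project to pixels."""
    cam = R @ points + t.reshape(3, 1)     # apply relative pose (R, t)
    pix = K @ cam
    return pix[:2] / pix[2:3]              # 2 x HW pixel coordinates

# Toy example: with the identity pose, pixels reproject onto themselves.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 32.0],
              [0.0, 0.0, 1.0]])
depth = np.full((64, 64), 2.0)             # constant dummy depth map
pts = backproject(depth, K)
uv = reproject(pts, K, np.eye(3), np.zeros(3))
```

In the full model the depth map is predicted by a network, the points carry learned feature vectors rather than colours, and the hard projection above is replaced by a soft, differentiable splatting so gradients flow back through the renderer.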

Fig 1: Generated images at new viewpoints using SynSin. Given the first image in the video, the model generates all subsequent images along the trajectory. The same model is used for all reconstructions. The scenes were not seen at train time.


Note that this repository is a large refactoring of the original code, done to allow for public release
and to integrate with pytorch3d.
Hence the models/datasets are not necessarily the same as those in the paper, and we cannot release
the saved test images we used.
For fair comparison and reproducibility, we recommend comparing against the numbers and models in this repo.

Setup and Installation



To quickly start using a pretrained model, see Quickstart.

Training and evaluating your own model

To download, train, or evaluate a model on a given dataset, please read the appropriate file.
(Note that we cannot distribute the raw pixels, so we have explained how we downloaded and organised the datasets in the appropriate file.)


If this work is helpful in your research, please cite:

```
@inproceedings{wiles2020synsin,
  author    = {Olivia Wiles and Georgia Gkioxari and Richard Szeliski and
               Justin Johnson},
  title     = {{SynSin}: {E}nd-to-end View Synthesis from a Single Image},
  booktitle = {CVPR},
  year      = {2020}
}
```