MONet: Unsupervised Scene Decomposition and Representation

MONet in PyTorch

We provide a PyTorch implementation of MONet.

This project is built on top of the CycleGAN/pix2pix code written by Jun-Yan Zhu and Taesung Park, and supported by Tongzhou Wang.

Note: The implementation is developed and tested on Python 3.7 and PyTorch 1.1.

Implementation details

Decoder Negative Log-Likelihood (NLL) loss

$\mathcal{L}(\theta, x) = -\sum_{n=1}^N \log \sum_{k=1}^K \exp{\bigg(\log{\dfrac{m_k}{\sqrt{\sigma_k^2}}} - \dfrac{(x_n - \mu_\theta(z_k))^2}{2\sigma_k^2} \bigg)}$ where *N* is the number of pixels in the image, and *K* is the number of mixture components.
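The inner sum over exponentials can underflow when a pixel lies far from every component mean, so the loss is usually evaluated with the log-sum-exp trick. The following is a minimal pure-Python sketch of the formula above, not the repository's actual code. The function name `monet_decoder_nll` and the argument layout (per-pixel log-masks and means, one scalar sigma per component) are assumptions made for illustration.

```python
import math


def monet_decoder_nll(x, log_masks, means, sigmas):
    """Evaluate the decoder NLL for a flattened single-channel image.

    Hypothetical layout (an assumption, not the repository's API):
      x            -- list of N pixel values
      log_masks    -- log_masks[k][n] = log m_k at pixel n
      means        -- means[k][n] = decoder mean mu_theta(z_k) at pixel n
      sigmas       -- sigmas[k] = positive standard deviation of component k
    """
    total = 0.0
    for n, x_n in enumerate(x):
        # Per-component exponent: log(m_k / sigma_k) - (x_n - mu_k)^2 / (2 sigma_k^2)
        terms = [
            log_masks[k][n]
            - math.log(sigmas[k])
            - (x_n - means[k][n]) ** 2 / (2.0 * sigmas[k] ** 2)
            for k in range(len(sigmas))
        ]
        # Log-sum-exp: subtract the maximum before exponentiating to avoid underflow.
        peak = max(terms)
        total -= peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total
```

In a PyTorch implementation the same reduction would typically be written with `torch.logsumexp` over the component dimension, which applies the same stabilisation.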

Test Results

CLEVR 64x64 @ 160 epochs

Prerequisites

- Linux or macOS (not tested)
- Python 3.7
- CPU or NVIDIA GPU + CUDA 10 + CuDNN

Getting Started

Installation

Clone this repo:

git clone https://github.com/baudm/MONet-pytorch.git
cd MONet-pytorch

Install [PyTorch](http://pytorch.org) 1.1+ and other dependencies (e.g., torchvision, visdom, and dominate).
- For pip users, please type the command pip install -r requirements.txt.
- For Conda users, we provide an installation script ./scripts/conda_deps.sh. Alternatively, you can create a new Conda environment using conda env create -f environment.yml.
- For Docker users, we provide the pre-built Docker image and Dockerfile. Please refer to our Docker page.

MONet train/test

Download a MONet dataset (e.g. CLEVR):

wget -cN https://dl.fbaipublicfiles.com/clevr/CLEVR_v1.0.zip
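The training command in this README points `--dataroot` at `./datasets/CLEVR_v1.0`, so the downloaded archive has to be unpacked under `./datasets` first. A minimal sketch of that step using only the standard library is shown here; the helper name `extract_dataset` is hypothetical, and it assumes the archive unpacks to a single top-level `CLEVR_v1.0` directory.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive_path, dest_dir="./datasets"):
    """Extract a dataset zip into dest_dir and return its top-level paths.

    Hypothetical helper for illustration; not part of the repository.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        # Collect the distinct top-level entries so the caller can check the layout.
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    return [dest / name for name in top_level]


# Example call (assumes the zip sits in the current directory):
# extract_dataset("CLEVR_v1.0.zip")
```

If the returned list contains something other than `datasets/CLEVR_v1.0`, adjust `--dataroot` to match the directory that was actually created.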

To view training results and loss plots, run python -m visdom.server and open the URL http://localhost:8097.

Train a model:

python train.py --dataroot ./datasets/CLEVR_v1.0 --name clevr_monet --model monet

To see more intermediate results, check out ./checkpoints/clevr_monet/web/index.html.

To generate a montage of the model outputs like the ones shown above:

./scripts/test_monet.sh
./scripts/generate_monet_montage.sh

Apply a pre-trained model

Download pretrained weights for CLEVR 64x64:

./scripts/download_monet_model.sh clevr
