IC-GAN: Instance-Conditioned GAN
Official Pytorch code of Instance-Conditioned GAN by Arantxa Casanova, Marlene Careil, Jakob Verbeek, Michał Drożdżal, Adriana Romero-Soriano.
Generate images with IC-GAN in a Colab Notebook
We provide a Google Colab notebook to generate images with IC-GAN and its class-conditional counterpart.
The figure below depicts two instances, unseen during training and downloaded from Creative Commons search, and the generated images with IC-GAN and class-conditional IC-GAN when conditioning on the class “castle”:
Additionally, and inspired by this Colab, we provide the functionality in the same Colab notebook to guide generations with text captions, using the CLIP model. As an example, the following figure shows three instance conditionings and a text caption (top), followed by the resulting generated images with IC-GAN (bottom), when optimizing the noise vector following CLIP’s gradient for 100 iterations.
Credit for the three instance conditionings, from left to right, that were modified with a resize and central crop: 1: “Landscape in Bavaria” by shining.darkness, licensed under CC BY 2.0, 2: “Fantasy Landscape – slolsss” by Douglas Tofoli is marked with CC PDM 1.0, 3: “How to Draw Landscapes Simply” by Kuwagata Keisai is marked with CC0 1.0
Requirements
- Python 3.8
- Cuda v10.2 / Cudnn v7.6.5
- gcc v7.3.0
- Pytorch 1.8.0
- A conda environment that contains the aforementioned version of Pytorch and the other required packages can be created from environment.yaml by entering the command: conda env create -f environment.yml
- Faiss: follow the instructions in the original repository.
Overview
This repository consists of four main folders:
- data_utils: A common folder to obtain and format the data needed to train and test IC-GAN, agnostic of the specific backbone.
- inference: Scripts to test the models both qualitatively and quantitatively.
- BigGAN_PyTorch: It provides the training, evaluation and sampling scripts for IC-GAN with a BigGAN backbone. The code base comes from Pytorch BigGAN repository, made available under the MIT License. It has been modified to add additional utilities and it enables IC-GAN training on top of it.
- stylegan2_ada_pytorch: It provides the training, evaluation and sampling scripts for IC-GAN with a StyleGAN2 backbone. The code base comes from StyleGAN2 Pytorch, made available under the Nvidia Source Code License. It has been modified to add additional utilities and it enables IC-GAN training on top of it.
(Python script) Generate images with IC-GAN
Alternatively, we can generate images with IC-GAN models directly from a python script, by following the next steps:
- Download the desired pretrained models (links below) and the pre-computed 1000 instance features from ImageNet and extract them into a folder pretrained_models_path.
model | backbone | class-conditional? | training dataset | resolution | url |
---|---|---|---|---|---|
IC-GAN | BigGAN | No | ImageNet | 256×256 | model |
IC-GAN (half capacity) | BigGAN | No | ImageNet | 256×256 | model |
IC-GAN | BigGAN | No | ImageNet | 128×128 | model |
IC-GAN | BigGAN | No | ImageNet | 64×64 | model |
IC-GAN | BigGAN | Yes | ImageNet | 256×256 | model |
IC-GAN (half capacity) | BigGAN | Yes | ImageNet | 256×256 | model |
IC-GAN | BigGAN | Yes | ImageNet | 128×128 | model |
IC-GAN | BigGAN | Yes | ImageNet | 64×64 | model |
IC-GAN | BigGAN | Yes | ImageNet-LT | 256×256 | model |
IC-GAN | BigGAN | Yes | ImageNet-LT | 128×128 | model |
IC-GAN | BigGAN | Yes | ImageNet-LT | 64×64 | model |
IC-GAN | BigGAN | No | COCO-Stuff | 256×256 | model |
IC-GAN | BigGAN | No | COCO-Stuff | 128×128 | model |
IC-GAN | StyleGAN2 | No | COCO-Stuff | 256×256 | model |
IC-GAN | StyleGAN2 | No | COCO-Stuff | 128×128 | model |
- Execute:
python inference/generate_images.py --root_path [pretrained_models_path] --model [model] --model_backbone [backbone] --resolution [res]
- model can be chosen from ["icgan", "cc_icgan"] to use the IC-GAN or the class-conditional IC-GAN model respectively.
- backbone can be chosen from ["biggan", "stylegan2"].
- res indicates the resolution at which the model has been trained. For ImageNet, choose one in [64, 128, 256], and for COCO-Stuff, one in [128, 256].
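The choices listed above can be checked before the script is launched. The following is a minimal sketch and not part of the repository: build_generate_command is a hypothetical helper that validates the documented values and assembles the argument list for inference/generate_images.py. Its dataset argument is an assumption of this sketch, used only to pick the valid resolutions; it is not passed to the script.

```python
# Hypothetical helper (not part of the IC-GAN repository): validates the
# documented choices and builds the argument list for generate_images.py.
VALID_MODELS = ("icgan", "cc_icgan")
VALID_BACKBONES = ("biggan", "stylegan2")
# Resolutions documented per training dataset.
VALID_RESOLUTIONS = {"imagenet": (64, 128, 256), "coco": (128, 256)}


def build_generate_command(root_path, model, backbone, resolution, dataset="imagenet"):
    """Return the command as a list of strings, or raise ValueError."""
    if model not in VALID_MODELS:
        raise ValueError(f"model must be one of {VALID_MODELS}, got {model!r}")
    if backbone not in VALID_BACKBONES:
        raise ValueError(f"backbone must be one of {VALID_BACKBONES}, got {backbone!r}")
    if dataset not in VALID_RESOLUTIONS:
        raise ValueError(f"dataset must be one of {tuple(VALID_RESOLUTIONS)}, got {dataset!r}")
    if resolution not in VALID_RESOLUTIONS[dataset]:
        raise ValueError(
            f"resolution for {dataset} must be one of {VALID_RESOLUTIONS[dataset]}, got {resolution!r}"
        )
    return [
        "python", "inference/generate_images.py",
        "--root_path", str(root_path),
        "--model", model,
        "--model_backbone", backbone,
        "--resolution", str(resolution),
    ]
```

The returned list could be handed to subprocess.run from the repository root; rejecting an unsupported combination up front avoids loading a checkpoint that does not exist.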
This script results in a .PNG file where several generated images are shown, given an instance feature (each row), and a sampled noise vector (each grid position).
Additional and optional parameters:
- index: (None by default) is an integer from 0 to 999 that chooses a specific instance feature vector out of the 1000 instances that have been selected with k-means on the ImageNet dataset and stored in pretrained_models_path/stored_instances.
- swap_target: (None by default) is an integer from 0 to 999 indicating an ImageNet class label. This label will be used to condition the class-conditional IC-GAN, regardless of which instance features are being used.
- which_dataset: (ImageNet by default) can be chosen from ["imagenet", "coco"] to indicate which dataset (training split) to sample the instances from.
- trained_dataset: (ImageNet by default) can be chosen from ["imagenet", "coco"] to indicate the dataset on which the IC-GAN model has been trained.
- num_imgs_gen: (5 by default) changes the number of noise vectors to sample per conditioning. Increasing this number results in a bigger .PNG file to save and load.
- num_conditionings_gen: (5 by default) changes the number of conditionings to sample. Increasing this number results in a bigger .PNG file to save and load.
- z_var: (1.0 by default) controls the truncation factor for the generation.
- Optionally, the script can be run with the additional options --visualize_instance_images --dataset_path [dataset_path] to visualize the ground-truth images corresponding to the conditioning instance features, given a path dataset_path to the dataset’s ground-truth images. Ground-truth instances will be plotted as the leftmost image for each row.
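To get a feel for how these parameters shape the output grid, here is a small sketch. It assumes one row per conditioning and one column per noise vector, plus one extra leftmost column when ground-truth instances are visualized, which is how the description above reads. Both grid_shape and check_index are hypothetical helpers, not functions from the repository.

```python
# Hypothetical helpers (not part of the IC-GAN repository).

def grid_shape(num_conditionings_gen=5, num_imgs_gen=5, visualize_instance_images=False):
    """Return (rows, columns) of the image grid under the assumptions above."""
    rows = num_conditionings_gen  # one row per instance conditioning
    # One column per sampled noise vector, plus the ground-truth image if shown.
    columns = num_imgs_gen + (1 if visualize_instance_images else 0)
    return rows, columns


def check_index(name, value, upper=999):
    """Validate an optional integer such as index or swap_target (0 to 999)."""
    if value is not None and not 0 <= value <= upper:
        raise ValueError(f"{name} must be None or an integer from 0 to {upper}, got {value}")
    return value
```

With the defaults this gives a 5 by 5 grid, so the number of tiles, and with it the size of the .PNG file, grows with the product of the two counts.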
Data preparation
ImageNet
- Download dataset from here.
- Download SwAV feature extractor weights from here.
- Replace the paths in data_utils/prepare_data.sh: out_path by the path where hdf5 files will be stored, path_imnet by the path where ImageNet dataset is downloaded, and path_swav by the path where SwAV weights are stored.
- Execute ./data_utils/prepare_data.sh imagenet [resolution], where [resolution] can be an integer in {64,128,256}. This script will create several hdf5 files:
  - ILSVRC[resolution]_xy.hdf5 and ILSVRC[resolution]_val_xy.hdf5, where images and labels are stored for the training and validation set respectively.
  - ILSVRC[resolution]_feats_[feature_extractor]_resnet50.hdf5 that contains the instance features for each image.
  - ILSVRC[resolution]_feats_[feature_extractor]_resnet50_nn_k[k_nn].hdf5 that contains the list of [k_nn] neighbors for each of the instance features.
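Once the script has finished, the naming pattern above can be used to confirm that every expected file exists. The following is a minimal sketch: expected_imagenet_files and missing_files are hypothetical helpers, and the values passed for feature_extractor and k_nn are whatever was used in prepare_data.sh (this sketch does not assume any defaults).

```python
from pathlib import Path

# Hypothetical helpers (not part of the IC-GAN repository).

def expected_imagenet_files(resolution, feature_extractor, k_nn):
    """Build the four hdf5 file names described above for ImageNet."""
    prefix = f"ILSVRC{resolution}"
    feats = f"{prefix}_feats_{feature_extractor}_resnet50"
    return [
        f"{prefix}_xy.hdf5",          # training images and labels
        f"{prefix}_val_xy.hdf5",      # validation images and labels
        f"{feats}.hdf5",              # instance features
        f"{feats}_nn_k{k_nn}.hdf5",   # nearest-neighbor lists
    ]


def missing_files(out_path, names):
    """Return the names that are not present as files under out_path."""
    return [name for name in names if not (Path(out_path) / name).is_file()]
```

An empty list from missing_files suggests the preparation step completed; it does not check the contents of the files.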
ImageNet-LT
- Download ImageNet dataset from here. Following ImageNet-LT, the file ImageNet_LT_train.txt can be downloaded from this link and later stored in the folder ./BigGAN_PyTorch/imagenet_lt.
- Download the pre-trained weights of the ResNet on ImageNet-LT from this link, provided by the classifier-balancing repository.
- Replace the paths in data_utils/prepare_data.sh: out_path by the path where hdf5 files will be stored, path_imnet by the path where ImageNet dataset is downloaded, and path_classifier_lt by the path where the pre-trained ResNet50 weights are stored.
- Execute ./data_utils/prepare_data.sh imagenet_lt [resolution], where [resolution] can be an integer in {64,128,256}. This script will create several hdf5 files:
  - ILSVRC[resolution]longtail_xy.hdf5, where images and labels are stored for the training and validation set respectively.
  - ILSVRC[resolution]longtail_feats_[feature_extractor]_resnet50.hdf5 that contains the instance features for each image.
  - ILSVRC[resolution]longtail_feats_[feature_extractor]_resnet50_nn_k[k_nn].hdf5 that contains the list of [k_nn] neighbors for each of the instance features.
COCO-Stuff
- Download the dataset following the LostGANs’ repository instructions.
- Download SwAV feature extractor weights from here.
- Replace the paths in data_utils/prepare_data.sh: out_path by the path where hdf5 files will be stored, path_imnet by the path where ImageNet dataset is downloaded, and path_swav by the path where SwAV weights are stored.
- Execute ./data_utils/prepare_data.sh coco [resolution], where [resolution] can be an integer in {128,256}. This script will create several hdf5 files:
  - COCO[resolution]_xy.hdf5 and COCO[resolution]_val_test_xy.hdf5, where images and labels are stored for the training and evaluation set respectively.
  - COCO[resolution]_feats_[feature_extractor]_resnet50.hdf5 that contains the instance features for each image.
  - COCO[resolution]_feats_[feature_extractor]_resnet50_nn_k[k_nn].hdf5 that contains the list of [k_nn] neighbors for each of the instance features.
Other datasets
- Download the corresponding dataset and store in a folder dataset_path.
- Download SwAV feature extractor weights from here.
- Replace the paths in data_utils/prepare_data.sh: out_path by the path where hdf5 files will be stored and path_swav by the path where SwAV weights are stored.
- Execute ./data_utils/prepare_data.sh [dataset_name] [resolution] [dataset_path], where [dataset_name] will be the dataset name, [resolution] can be an integer, for example 128 or 256, and dataset_path contains the dataset images. This script will create several hdf5 files:
  - [dataset_name][resolution]_xy.hdf5, where images and labels are stored for the training set.
  - [dataset_name][resolution]_feats_[feature_extractor]_resnet50.hdf5 that contains the instance features for each image.
  - [dataset_name][resolution]_feats_[feature_extractor]_resnet50_nn_k[k_nn].hdf5 that contains the list of k_nn neighbors for each of the instance features.
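The neighbor files mentioned in each data preparation recipe store, for every instance, the indexes of its k_nn closest instances in feature space. The sketch below illustrates the idea with a brute-force search in plain Python. It assumes cosine similarity, non-zero feature vectors, and that an instance counts as its own nearest neighbor; these are assumptions of this example, and the repository's scripts may compute the lists differently (for instance with faiss). The function names are hypothetical.

```python
import math

# Illustrative brute-force search (not the repository's implementation).

def normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def nearest_neighbors(features, k_nn):
    """For each feature vector, return the indexes of its k_nn most similar vectors."""
    unit = [normalize(f) for f in features]
    neighbors = []
    for query in unit:
        # Cosine similarity reduces to a dot product on unit vectors.
        sims = [sum(a * b for a, b in zip(query, other)) for other in unit]
        ranked = sorted(range(len(unit)), key=lambda i: sims[i], reverse=True)
        neighbors.append(ranked[:k_nn])
    return neighbors


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(nearest_neighbors(features, k_nn=2))
```

This quadratic search is only practical for small feature sets; it is meant to show what the stored lists represent, not how to build them at ImageNet scale.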
How to subsample an instance feature dataset with k-means
To downsample the instance feature vector dataset, after we have prepared the data, we can use the k-means algorithm:
python data_utils/store_kmeans_indexes.py --resolution [resolution] --which_dataset [dataset_name] --data_root [data_path]
- Adding --gpu allows the faiss library to compute k-means leveraging GPUs, resulting in faster execution.
- Adding the parameter --feature_extractor [feature_extractor] chooses which feature extractor to use, with feature_extractor in ['selfsupervised', 'classification'], if we are using SwAV as feature extractor or the ResNet pretrained on the classification task on ImageNet, respectively.
- The number of k-means clusters can be set with --kmeans_subsampled [centers], where centers is an integer.
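As an illustration of what subsampling with k-means means here, the sketch below clusters a toy set of feature vectors and keeps, for each cluster center, the index of the closest instance. It swaps in a plain-Python Lloyd's algorithm for the faiss implementation used by the repository, and mapping each center back to its nearest instance is an assumption of this example. The function names are hypothetical.

```python
import random

# Illustrative plain-Python k-means (the repository relies on faiss instead).

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_subsample(features, num_centers, num_iters=20, seed=0):
    """Return, for each k-means centroid, the index of the nearest instance."""
    rng = random.Random(seed)
    # Initialise the centroids from distinct, randomly chosen instances.
    start = rng.sample(range(len(features)), num_centers)
    centroids = [list(features[i]) for i in start]
    for _ in range(num_iters):
        clusters = [[] for _ in range(num_centers)]
        for feat in features:
            nearest = min(range(num_centers),
                          key=lambda c: squared_distance(feat, centroids[c]))
            clusters[nearest].append(feat)
        for c, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return [min(range(len(features)),
                key=lambda i: squared_distance(features[i], centroid))
            for centroid in centroids]


# Two well-separated groups of 1-D features; one index per group is kept.
toy = [[0.1 * i] for i in range(5)] + [[10 + 0.1 * i] for i in range(5)]
print(sorted(kmeans_subsample(toy, num_centers=2)))
```

The selected indexes play the role of the stored instances that the index parameter of the generation script picks from.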
How to train the models
BigGAN or StyleGAN2 backbone
Training parameters are stored in JSON files in [backbone_folder]/config_files/[dataset]/*.json, where [backbone_folder] is either BigGAN_PyTorch or stylegan2_ada_pytorch and [dataset] can either be ImageNet, ImageNet-LT or COCO_Stuff.
cd BigGAN_PyTorch
python run.py --json_config config_files//.json --data_root [data_root] --base_root [base_root]
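Since the configuration files follow the [backbone_folder]/config_files/[dataset]/*.json layout described above, the available files can be listed before picking one for --json_config. The following is a minimal sketch; list_config_files is a hypothetical helper and repo_root is assumed to be the path to a local checkout of the repository.

```python
from pathlib import Path

# Hypothetical helper (not part of the IC-GAN repository).

def list_config_files(repo_root, backbone_folder, dataset):
    """Return the sorted JSON file names in the config folder for a dataset."""
    config_dir = Path(repo_root) / backbone_folder / "config_files" / dataset
    # glob yields nothing if the folder does not exist, so the result is [].
    return sorted(path.name for path in config_dir.glob("*.json"))
```

For example, list_config_files(".", "BigGAN_PyTorch", "ImageNet") would return the names found in that folder of the checkout, or an empty list if the folder is missing.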