meta-interpolation

Myungsub Choi, Janghoon Choi, Sungyong Baik, Tae Hyun Kim, Kyoung Mu Lee
Source code for the CVPR 2020 paper "Scene-Adaptive Video Frame Interpolation via Meta-Learning"

Requirements

  • Ubuntu 18.04
  • Python==3.7
  • numpy==1.18.1
  • pytorch==1.4.0, cudatoolkit==10.1
  • opencv==3.4.2
  • cupy==7.3 (recommended: conda install cupy -c conda-forge)
  • tqdm==4.44.1

For [DAIN], the environment is different; please check dain/dain_env.yml for the requirements.
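One way to set up the environment is sketched below. The versions follow the list above; the environment name and channel choices are illustrative and not the repo's official instructions (only the conda-forge channel for cupy is recommended above):

```bash
# Illustrative conda setup; package versions follow the requirements list above.
conda create -n meta-interpolation python=3.7
conda activate meta-interpolation
conda install pytorch=1.4.0 cudatoolkit=10.1 -c pytorch
conda install numpy=1.18.1 opencv=3.4.2 tqdm=4.44.1 -c conda-forge
conda install cupy=7.3 -c conda-forge   # channel recommended above
```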

Usage

Disclaimer: This code has been re-organized so that multiple different models can run in a single codebase. Due to many version and environment changes, the numbers obtained from this code may differ (usually for the better) from those reported in the paper. The original code modified the main training scripts of each frame interpolation repo ([DVF (voxelflow)], [SuperSloMo], [SepConv], [DAIN]); those scripts are kept in ./legacy/*.py. If you want to exactly reproduce the numbers reported in our paper, please contact @myungsub for the legacy experimental settings.

Dataset Preparation

  • We use the [Vimeo90K Septuplet dataset] for training and testing
    • After downloading the full dataset, make a symbolic link in the data/ folder (see the layout sketch after this list):
      • ln -s /path/to/vimeo_septuplet_data/ ./data/vimeo_septuplet
  • For further evaluation, the Middlebury-OTHERS and HD datasets are also used (see Results below)
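A sketch of the expected layout after linking the Vimeo90K data; the folder and split-file names follow the official Vimeo90K septuplet release and may differ slightly depending on the download:

```bash
ln -s /path/to/vimeo_septuplet_data/ ./data/vimeo_septuplet
# data/vimeo_septuplet/
# ├── sequences/           # 7-frame clips, e.g. 00001/0001/im1.png ... im7.png
# ├── sep_trainlist.txt    # training split
# └── sep_testlist.txt     # test split
```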

Frame Interpolation Model Preparation

  • Download pretrained models from [Here], and save them to ./pretrained_models/*.pth
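A minimal sketch of where the downloaded checkpoints should end up; the source path of the downloaded files is a placeholder:

```bash
mkdir -p pretrained_models
# move the downloaded *.pth checkpoints into the expected folder
mv /path/to/downloaded/*.pth ./pretrained_models/
```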

Training / Testing with Vimeo90K-Septuplet dataset

  • For training, simply run: ./scripts/run_{VFI_MODEL_NAME}.sh
    • Currently supports: sepconv, voxelflow, superslomo, cain, and rrin
    • Other models are coming soon!
  • For testing, uncomment the two lines containing --mode val and --pretrained_model {MODEL_NAME} in the same script (see the sketch below)
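For example, training and then evaluating the CAIN variant could look like the following; the exact value passed to --pretrained_model is illustrative:

```bash
# Train (script names follow the ./scripts/run_{VFI_MODEL_NAME}.sh pattern)
./scripts/run_cain.sh

# Test: re-run the same script after uncommenting the two lines containing
#   --mode val
#   --pretrained_model {MODEL_NAME}
./scripts/run_cain.sh
```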

Testing with custom data

  • See scripts/run_test.sh for details:
  • Things to change:
    • Point --data_root to the directory containing your video frames
    • Make sure --img_fmt matches your image file format (defaults to png)
    • Set --model, --loss, and --pretrained_model to the desired model (a combined example follows this list):
      • For SepConv, --model should be sepconv, and --loss should be 1*L1
      • For VoxelFlow, --model should be voxelflow, and --loss should be 1*MSE
      • For SuperSloMo, --model should be superslomo, --loss should be 1*Super
      • For DAIN, --model should be dain, and --loss should be 1*L1
      • For CAIN, --model should be cain, and --loss should be 1*L1
      • For RRIN, --model should be rrin, and --loss should be 1*L1
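Putting it together, the relevant flags inside scripts/run_test.sh might look like this for CAIN; the data path and checkpoint name are placeholders, and the surrounding command is kept as provided by the script:

```bash
# Illustrative flag settings to edit inside scripts/run_test.sh:
#   --data_root /path/to/your/frames/   # directory with the input video frames
#   --img_fmt png                       # image file extension
#   --model cain --loss 1*L1            # model / loss pair from the list above
#   --pretrained_model {MODEL_NAME}     # checkpoint under ./pretrained_models/
```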

Using Other Meta-Learning Algorithms

  • The current code supports more advanced meta-learning algorithms than vanilla MAML, e.g. MAML++, L2F, or Meta-SGD.
    • For MAML++, you can explore many different hyperparameters by adding the corresponding options (see config.py)
    • For L2F, just uncomment --attenuate in scripts/run_{VFI_MODEL_NAME}.sh
    • For Meta-SGD, just uncomment --metasgd (this usually gives the best performance!); an example of both flags follows this list
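For instance, the lines to uncomment in scripts/run_{VFI_MODEL_NAME}.sh look like this (flag names as referenced above; see config.py for the full option list):

```bash
# Inside scripts/run_{VFI_MODEL_NAME}.sh:
#   --attenuate   # enable L2F
#   --metasgd     # enable Meta-SGD (usually the best-performing option)
```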

Framework Overview

(Figure: framework overview)

Results

  • Qualitative results for the Vimeo90K-Septuplet dataset

(Figure: qualitative results on Vimeo90K-Septuplet)

  • Qualitative results for Middlebury-OTHERS dataset

(Figure: qualitative results on Middlebury-OTHERS)

  • Qualitative results for HD dataset

(Figure: qualitative results on the HD dataset)

Additional Results Video

(Thumbnail for the additional results video)

Citation

If you find this code useful for your research, please consider citing the following paper:

@inproceedings{choi2020meta,
    author = {Choi, Myungsub and Choi, Janghoon and Baik, Sungyong and Kim, Tae Hyun and Lee, Kyoung Mu},
    title = {Scene-Adaptive Video Frame Interpolation via Meta-Learning},
    booktitle = {CVPR},
    year = {2020}
}
