EGVSR-PyTorch
This is a PyTorch implementation of EGVSR: Efficient & Generic Video Super-Resolution (VSR), which uses sub-pixel convolution to speed up inference of the TecoGAN VSR model. Please refer to the official implementations of ESPCN and TecoGAN for more information.
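For readers unfamiliar with sub-pixel convolution, the rearrangement step at its core can be sketched in plain Python. This is an illustration only, not code from this repo: `pixel_shuffle` is a hypothetical helper that uses the same channel-to-space layout as `torch.nn.PixelShuffle`. A convolution first produces C*r*r low-resolution channels, and the shuffle then interleaves them into C channels at r times the resolution.

```python
def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Hypothetical helper for illustration. It follows the layout of
    torch.nn.PixelShuffle:
        out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w]
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output cell
            for j in range(r):      # column offset inside each r x r output cell
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 feature maps become one 2x2 image (upscale factor r = 2).
lr_features = [[[1]], [[2]], [[3]], [[4]]]
hr_image = pixel_shuffle(lr_features, 2)
print(hr_image)  # [[[1, 2], [3, 4]]]
```

Because the convolutions run on the low-resolution grid and only this final rearrangement produces high-resolution pixels, most of the computation avoids the larger output grid.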
Features
- Unified Framework: This repo provides a unified framework for various state-of-the-art DL-based VSR methods, such as VESPCN, SOFVSR, FRVSR, TecoGAN and our EGVSR.
- Multiple Test Datasets: This repo offers three video datasets for testing: the standard test dataset Vid4, Tos3 as used in TecoGAN, and our new dataset Gvt72 (selected from the Vimeo site and covering more scenes).
- Better Performance: This repo provides a model with faster inference speed and better overall performance than prior methods. See the Benchmarks section for details.
Dependencies
- Ubuntu >= 16.04
- NVIDIA GPU + CUDA & CUDNN
- Python 3
- PyTorch >= 1.0.0
- Python packages: numpy, matplotlib, opencv-python, pyyaml, lmdb (requirements.txt & req.txt)
- (Optional) Matlab >= R2016b
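A quick way to confirm that the Python packages above are installed is to look them up by import name. This is a minimal sketch, not a script shipped with this repo; `REQUIRED_MODULES` and `find_missing` are hypothetical names. Note that `opencv-python` is imported as `cv2` and `pyyaml` as `yaml`.

```python
import importlib.util

# Import name -> pip package name for the dependencies listed above.
REQUIRED_MODULES = {
    "torch": "torch",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "cv2": "opencv-python",
    "yaml": "pyyaml",
    "lmdb": "lmdb",
}


def find_missing(modules):
    """Return the pip names of top-level modules that cannot be located."""
    return [
        pip_name
        for import_name, pip_name in modules.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed Python packages were found.")
```

The check only tests whether each module can be located; it does not verify versions or that CUDA is usable.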
Datasets
A. Training Dataset
Download the official training dataset following the instructions in TecoGAN-TensorFlow, rename it to VimeoTecoGAN, and place it under ./data.
B. Testing Datasets
- Vid4 -- Four video sequences: city, calendar, foliage and walk;
- Tos3 -- Three video sequences: bridge, face and room;
- Gvt72 -- Generic VSR Test Dataset: 72 video sequences (including natural scenery, cultural scenery, streetscapes, life records, sports photography, etc., as shown below)
You can get them at :arrow_double_down: Baidu Netdisk (extraction code: 8tqc) and put them into :file_folder: Datasets.
The following shows the structure of the above three datasets.
data
  ├─ Vid4
    ├─ GT                # Ground-Truth (GT) video sequences
      └─ calendar
        ├─ 0001.png
        └─ ...
    ├─ Gaussian4xLR      # Low Resolution (LR) video sequences in gaussian degradation and x4 down-sampling
      └─ calendar
        ├─ 0001.png
        └─ ...
  └─ ToS3
    ├─ GT
    └─ Gaussian4xLR
  └─ Gvt72
    ├─ GT
    └─ Gaussian4xLR
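Given this layout, GT and LR frames can be paired by sequence name and frame file name. The following is a minimal sketch under that assumption; `list_frame_pairs` is a hypothetical helper, not part of this repo's data loading code. The demo builds a throwaway directory with empty files so that it runs without the real datasets.

```python
import tempfile
from pathlib import Path


def list_frame_pairs(data_root, dataset, lr_dir="Gaussian4xLR"):
    """Return (sequence_name, gt_path, lr_path) tuples for one test dataset.

    Hypothetical helper: it only assumes the layout
    data/<dataset>/{GT,Gaussian4xLR}/<sequence>/<frame>.png shown above.
    """
    gt_root = Path(data_root) / dataset / "GT"
    lr_root = Path(data_root) / dataset / lr_dir
    pairs = []
    for seq_dir in sorted(p for p in gt_root.iterdir() if p.is_dir()):
        for gt_frame in sorted(seq_dir.glob("*.png")):
            lr_frame = lr_root / seq_dir.name / gt_frame.name
            if not lr_frame.is_file():
                raise FileNotFoundError(f"missing LR frame: {lr_frame}")
            pairs.append((seq_dir.name, gt_frame, lr_frame))
    return pairs


# Demo on a temporary directory that mimics the Vid4 layout with empty files.
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("GT", "Gaussian4xLR"):
        seq = Path(tmp) / "Vid4" / sub / "calendar"
        seq.mkdir(parents=True)
        for name in ("0001.png", "0002.png"):
            (seq / name).touch()
    demo_pairs = list_frame_pairs(tmp, "Vid4")

print([(seq, gt.name, lr.name) for seq, gt, lr in demo_pairs])
```

With the real datasets in place, `list_frame_pairs("./data", "Vid4")` would return one tuple per frame, and a missing LR counterpart raises an error early instead of failing during evaluation.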
Benchmarks
Experimental Environment
Component | Version | Info. |
---|---|---|
System | Ubuntu 18.04.5 LTS | X86_64 |
CPU | Intel i9-9900 | 3.10GHz |
GPU | Nvidia RTX 2080Ti | 11GB GDDR6 |
Memory | DDR4 2666 | 32GB×2 |
A. Test on Vid4 Dataset
Panels: 1. LR, 2. VESPCN, 3. SOFVSR, 4. DUF, 5. Ours: EGVSR, 6. GT
Objective metrics for visual quality evaluation[1]
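As a generic illustration of an objective visual quality metric, peak signal-to-noise ratio (PSNR) can be computed as sketched below. This is an assumption-laden example rather than this repo's evaluation code: it assumes 8-bit pixel values flattened into plain sequences, and `psnr` is a hypothetical helper name.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


gt_pixels = [52, 55, 61, 59]
sr_pixels = [54, 55, 60, 57]
print(f"PSNR: {psnr(gt_pixels, sr_pixels):.2f} dB")
```

A higher PSNR indicates closer agreement with the ground truth. Per-frame values are typically averaged over a sequence.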
B. Test on Tos3 Dataset
Panels: 1. VESPCN, 2. SOFVSR, 3. FRVSR, 4. TecoGAN, 5. Ours: EGVSR, 6. GT
C. Test on Gvt72 Dataset
Panels: 1. LR, 2. VESPCN, 3. SOFVSR, 4. DUF, 5. Ours: EGVSR, 6. GT
Objective metrics for visual quality and temporal coherence evaluation[1]
D. Optical-Flow based Motion Compensation
Please refer to FLOW_walk, FLOW_foliage and FLOW_city.
E. Comprehensive Performance
Comparison of various SOTA VSR models on video quality score and speed performance[3]
[1] :arrow_down:: a smaller value means better performance; :arrow_up:: a larger value means better performance. Red marks the Top-1 result and blue the Top-2.
[2] The video quality score is calculated over both the spatial and temporal domains, using lambda1 = lambda2 = lambda3 = 1/3.
[3] FLOPs and speed are computed on RGB input, upscaling from 960x540 to 3840x2160 (4K), on an NVIDIA GeForce RTX 2080 Ti GPU.
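The resolutions in [3] correspond to x4 upscaling, which the short calculation below makes explicit. It is plain arithmetic for orientation; `upscaled_size` is a hypothetical helper and does not reproduce how FLOPs or speed were measured.

```python
def upscaled_size(width, height, scale=4):
    """Return the output (width, height) for a given upscaling factor."""
    return width * scale, height * scale


lr_w, lr_h = 960, 540
hr_w, hr_h = upscaled_size(lr_w, lr_h, scale=4)
pixel_ratio = (hr_w * hr_h) // (lr_w * lr_h)

print(hr_w, hr_h)    # 3840 2160
print(hr_w * hr_h)   # 8294400 output pixels per frame
print(pixel_ratio)   # 16 output pixels per input pixel
```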