SDL: Synthetic Document Layout dataset

SDL is a project for synthesizing document images. It supports multi-level labeling of the generated images and can generate documents in multiple languages.

Sample image

see

Structure of data

structure

Quick start

python flexible_layout.py --config_file configs/page.yaml
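To generate data from a script instead of the shell, the command above can be wrapped in a small Python helper. This is only a sketch: the function names `build_command` and `run_generation` are hypothetical and not part of SDL, and it assumes the code is run from the repository root, where `flexible_layout.py` and `configs/page.yaml` are located.

```python
import subprocess
import sys
from pathlib import Path


def build_command(config_path, script="flexible_layout.py"):
    """Return the argument list for the quick-start command.

    Hypothetical helper: it only mirrors the command shown above,
    using the current Python interpreter to launch the script.
    """
    return [sys.executable, script, "--config_file", str(config_path)]


def run_generation(config_path, script="flexible_layout.py"):
    """Run the generator once, failing early if the config file is missing."""
    config = Path(config_path)
    if not config.is_file():
        raise FileNotFoundError(f"config file not found: {config}")
    # check=True raises CalledProcessError if the script exits with a non-zero status.
    return subprocess.run(build_command(config_path, script), check=True)
```

Calling `run_generation("configs/page.yaml")` from the repository root should then be equivalent to the quick-start command, with an early error if the config path is wrong.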

Instructions for running data generation

Go to instructions

Visualization of the result

python data_manipulation/visualize.py

To be released soon.

Paper

https://arxiv.org/abs/2106.15117

GitHub

https://github.com/tson1997/SDL-Document-Image-Generation