PubLayNet is a large dataset of document images whose layout is annotated with both bounding boxes and polygonal segmentations.
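To make the two annotation types concrete, here is a minimal sketch of how a polygonal segmentation relates to a bounding box. It assumes COCO-style conventions (polygons as flat `[x1, y1, x2, y2, ...]` lists, boxes as `[x, y, width, height]`), which this README does not itself state. The helper name `polygon_to_bbox`, the `category` key, and the sample values are hypothetical.

```python
def polygon_to_bbox(polygon):
    """Return the [x, y, width, height] box enclosing a flat polygon.

    Assumes a COCO-style flat list [x1, y1, x2, y2, ...]; this layout is an
    assumption for illustration, not something confirmed by the README.
    """
    if len(polygon) < 6 or len(polygon) % 2 != 0:
        raise ValueError("polygon needs at least 3 (x, y) points")
    xs = polygon[0::2]  # every even index is an x coordinate
    ys = polygon[1::2]  # every odd index is a y coordinate
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


# Hypothetical annotation for one layout region (values are made up).
annotation = {
    "category": "text",
    "segmentation": [[50.0, 100.0, 300.0, 100.0, 300.0, 180.0, 50.0, 180.0]],
}

bbox = polygon_to_bbox(annotation["segmentation"][0])
print(bbox)
```

The bounding box is the coarser of the two: any polygon can be reduced to its enclosing box, but not the other way round.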
15/Sept/2020 - Add training code.
29/Feb/2020 - Add benchmarking for
22/Feb/2020 - Pre-trained Mask-RCNN model (PyTorch) is released.
| Architecture | Iter num (x16) | AP | AP50 | AP75 | AP Small | AP Medium | AP Large | MD5SUM |
|---|---|---|---|---|---|---|---|---|
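The MD5SUM column can be used to confirm that a weights file downloaded intact. A minimal sketch using only the Python standard library follows; the function names `md5sum` and `matches_md5` are hypothetical helpers, not part of this repository.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected):
    """Compare a file's digest with the expected value, ignoring case."""
    return md5sum(path) == expected.strip().lower()
```

For example, `matches_md5("model.pth", expected)` returns `True` when `expected` is the value copied from the MD5SUM column and the file is intact. The file name here is only a placeholder.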
Download the trained weights from the Benchmarking section above and place them in the maskrcnn directory.
cd maskrcnn
python infer.py --image_path="document_image_dir/image.jpg" --model_path="mrcnn_model_dir/model.pth" --output_path="model_segmentation_output_dir/"
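The command above handles one image. For a folder of images, one possible approach is to build the same command per file and run it in a loop. The sketch below uses only the three flags shown above; the helper names, the list of image extensions, and the assumption that `infer.py` accepts the `--flag=value` form are mine and have not been checked against the script itself.

```python
import subprocess
import sys
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed input formats


def build_infer_command(image_path, model_path, output_path):
    """Build the argument list for one infer.py call (flags from the README)."""
    return [
        sys.executable,
        "infer.py",
        f"--image_path={image_path}",
        f"--model_path={model_path}",
        f"--output_path={output_path}",
    ]


def run_on_directory(image_dir, model_path, output_path, cwd="maskrcnn"):
    """Run infer.py once per image found in image_dir.

    Image paths are made absolute so they stay valid after changing into
    cwd; model_path and output_path are passed through unchanged, so they
    are interpreted relative to cwd.
    """
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            cmd = build_infer_command(image.resolve(), model_path, output_path)
            subprocess.run(cmd, cwd=cwd, check=True)
```

Launching a new process per image reloads the model each time, so this is only practical for small batches.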
Average Precision in validation stages (via TensorBoard)
Please take a look at the training_code directory. Sorry for the dirty code, but I really don't have time to refactor it :D