Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Requirements

  • pytorch 1.1+
  • torchvision 0.3+
  • pyclipper
  • opencv3
  • gcc 4.9+
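
The Python-side requirements above can be checked before training. The following is a minimal sketch, not part of the repository: the module names (`torch`, `torchvision`, `cv2`) are assumptions about how each package is imported, "opencv3" is read as "3.0 or newer", and pyclipper and gcc are not checked.

```python
import importlib
import re


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum`, comparing numeric parts only."""
    return parse_version(installed) >= parse_version(minimum)


# Assumed import names and minimum versions, taken from the list above.
MINIMUMS = {"torch": "1.1", "torchvision": "0.3", "cv2": "3.0"}


def check_requirements(minimums=MINIMUMS):
    """Map each module name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        version = getattr(module, "__version__", "0")
        if meets_minimum(version, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({version})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Local build suffixes such as `+cu117` are ignored by `parse_version`, so only the numeric prefix is compared.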

Download

PAN_resnet18_FPEM_FFM and PAN_resnet18_FPEM_FFM on icdar2015:

The updated models (resnet18: 78.8, shufflenetv2: 72.4, lr: 1e-3) are not the best models.

google drive

Data Preparation

train: prepare a text file in the following format, using '\t' as the separator

/path/to/img.jpg path/to/label.txt ...
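
A training list in this format could be generated with a short script. This is a sketch under assumptions: the helper names `write_train_list` and `read_train_list` are hypothetical and do not come from the repository, and each line is assumed to hold exactly one image path and one label path.

```python
from pathlib import Path


def write_train_list(pairs, list_path):
    """Write one 'image<TAB>label' line per (image, label) pair."""
    lines = [f"{image}\t{label}" for image, label in pairs]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def read_train_list(list_path):
    """Parse the file back into (image, label) tuples, skipping blank lines."""
    pairs = []
    for line in Path(list_path).read_text(encoding="utf-8").splitlines():
        if line.strip():
            image, label = line.split("\t")
            pairs.append((image, label))
    return pairs


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example = [("/path/to/img.jpg", "/path/to/label.txt")]
    write_train_list(example, "train.txt")
    print(read_train_list("train.txt"))
```

Reading the file back with `read_train_list` is a cheap way to confirm that every line really contains a single tab.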

val: use a folder with the following layout

img/  stores the images
gt/   stores the ground-truth files
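
Before evaluating, it may help to confirm that every validation image has a ground-truth file. The sketch below assumes a label naming scheme that the text does not specify: `gt_<image stem>.txt` (ICDAR 2015 style) or `<image stem>.txt`. The function name `pair_val_files` is hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pair_val_files(val_root):
    """Pair each image in img/ with its label in gt/; also return unlabeled images."""
    root = Path(val_root)
    pairs, missing = [], []
    for image in sorted((root / "img").iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Assumption: labels are named gt_<stem>.txt or <stem>.txt.
        candidates = [
            root / "gt" / f"gt_{image.stem}.txt",
            root / "gt" / f"{image.stem}.txt",
        ]
        label = next((c for c in candidates if c.exists()), None)
        if label is None:
            missing.append(image)
        else:
            pairs.append((image, label))
    return pairs, missing
```

A non-empty `missing` list points to images that would have no annotation to compare against.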

Train

  1. configure train_data_path and val_data_path in config.json
  2. use the following script to run training

python3 train.py
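
Step 1 can also be scripted. The layout of config.json is not described here, so this sketch makes no assumption about where the two keys sit: it searches the whole JSON tree and sets every occurrence. The helper names are hypothetical, and the value type each key expects (a string or a list) should be checked against the existing file.

```python
import json


def set_key_everywhere(node, key, value):
    """Set `key` to `value` wherever it occurs in nested dicts/lists; return the count."""
    count = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                count += 1
            else:
                count += set_key_everywhere(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            count += set_key_everywhere(item, key, value)
    return count


def update_data_paths(config_path, train_data_path, val_data_path):
    """Rewrite config_path in place and report how many keys were updated."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    updated = {
        "train_data_path": set_key_everywhere(config, "train_data_path", train_data_path),
        "val_data_path": set_key_everywhere(config, "val_data_path", val_data_path),
    }
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return updated
```

A count of zero for either key means the name was not found and the file was left effectively unchanged for that key.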

Test

eval.py is used to test the model on a test dataset

  1. configure model_path, img_path, gt_path, and save_path in eval.py
  2. use the following script to test

python3 eval.py

Predict

predict.py is used to run inference on a single image

  1. configure model_path and img_path in predict.py
  2. use the following script to predict

python3 predict.py

The project is still under development.

Performance

ICDAR 2015

trained only on the ICDAR2015 dataset

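| Method | image size (short size) | learning rate | Precision (%) | Recall (%) | F-measure (%) | FPS |
|---|---|---|---|---|---|---|
| paper (resnet18) | 736 | x | x | x | 80.4 | 26.1 |
| my (ShuffleNetV2+FPEM_FFM+PSE expansion) | 736 | 1e-3 | 81.72 | 66.73 | 73.47 | 24.71 (P100) |
| my (resnet18+FPEM_FFM+PSE expansion) | 736 | 1e-3 | 84.93 | 74.09 | 79.14 | 21.31 (P100) |
| my (resnet50+FPEM_FFM+PSE expansion) | 736 | 1e-3 | 84.23 | 76.12 | 79.96 | 14.22 (P100) |
| my (ShuffleNetV2+FPEM_FFM+PSE expansion) | 736 | 1e-4 | 75.14 | 57.34 | 65.04 | 24.71 (P100) |
| my (resnet18+FPEM_FFM+PSE expansion) | 736 | 1e-4 | 83.89 | 69.23 | 75.86 | 21.31 (P100) |
| my (resnet50+FPEM_FFM+PSE expansion) | 736 | 1e-4 | 85.29 | 75.1 | 79.87 | 14.22 (P100) |
| my (resnet18+FPN+PSE expansion) | 736 | 1e-3 | 76.50 | 74.70 | 75.59 | 14.47 (P100) |
| my (resnet50+FPN+PSE expansion) | 736 | 1e-3 | 71.82 | 75.73 | 73.72 | 10.67 (P100) |
| my (resnet18+FPN+PSE expansion) | 736 | 1e-4 | 74.19 | 72.34 | 73.25 | 14.47 (P100) |
| my (resnet50+FPN+PSE expansion) | 736 | 1e-4 | 78.96 | 76.27 | 77.59 | 10.67 (P100) |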
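
The F-measure values are consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The sketch below recomputes it for two rows of the table (resnet18+FPEM_FFM and ShuffleNetV2+FPEM_FFM at lr 1e-3); it only illustrates the formula and is not the repository's evaluation code.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in the same units, e.g. percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from two rows of the table.
rows = [(84.93, 74.09, 79.14), (81.72, 66.73, 73.47)]

if __name__ == "__main__":
    for p, r, reported in rows:
        print(f"P={p} R={r} computed={f_measure(p, r):.2f} reported={reported}")
```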

examples

todo

  • MobileNet backbone
  • ShuffleNet backbone

reference

  1. https://arxiv.org/pdf/1908.05900.pdf
  2. https://github.com/WenmuZhou/PSENet.pytorch

If this repository helps you, please star it. Thanks.

GitHub - WenmuZhou/PAN.pytorch: An unofficial pytorch implementation of PAN (PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network