Real-time Scene Text Detection with Differentiable Binarization

Jan 20, 2020

Real-time-Text-Detection

PyTorch re-implementation of Real-time Scene Text Detection with Differentiable Binarization
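The differentiable binarization step named in the title replaces a hard threshold with a steep sigmoid, B = 1 / (1 + exp(-k (P - T))), where P is the probability map, T is the threshold map, and k is an amplifying factor (set to 50 in the paper). The following is a minimal, framework-free sketch of that formula. The function names are illustrative and are not taken from this repository, which implements the step in PyTorch.

```python
import math


def differentiable_binarization(prob, thresh, k=50.0):
    """Approximate binary value for one pixel: 1 / (1 + exp(-k * (prob - thresh)))."""
    return 1.0 / (1.0 + math.exp(-k * (prob - thresh)))


def binarize_map(prob_map, thresh_map, k=50.0):
    """Apply the formula element-wise to two equally sized 2-D lists (illustrative helper)."""
    return [
        [differentiable_binarization(p, t, k) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


if __name__ == "__main__":
    # Pixels above the threshold approach 1, pixels below it approach 0.
    print(binarize_map([[0.9, 0.1]], [[0.3, 0.3]]))
```

Because the sigmoid is smooth, gradients flow through both the probability map and the threshold map during training, which a hard threshold would not allow.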

Differences between the paper and this implementation

Uses dice loss instead of BCE (binary cross-entropy) loss.
Uses regular convolution rather than deformable convolution in the backbone network.
The architecture of the backbone network is a simple FPN (feature pyramid network).
OHEM (online hard example mining) is not implemented.
The ground truth of the threshold map is a constant 1 rather than "the distance to the closest segment".
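To illustrate the first difference, here is a minimal sketch of a dice loss on flat sequences of per-pixel values. It is written without PyTorch so that it runs anywhere. The function name, the `eps` smoothing term, and the exact formulation are assumptions, so the loss in this repository may differ in detail.

```python
def dice_loss(pred, gt, eps=1e-6):
    """Dice loss for two flat sequences of values in [0, 1].

    loss = 1 - 2 * sum(pred * gt) / (sum(pred) + sum(gt) + eps)
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    intersection = sum(p * g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt) + eps  # eps avoids division by zero
    return 1.0 - 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_loss([1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]))  # close to 0
    print(dice_loss([1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]))  # no overlap
```

Dice loss measures overlap relative to the total foreground mass, so it is less sensitive than BCE to the imbalance between the few text pixels and the many background pixels.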

Introduction

Thanks to this project:

https://github.com/WenmuZhou/PAN.pytorch

The features are summarized below:

Uses resnet18/resnet50/shufflenetV2 as the backbone.

Installation

PyTorch 1.1.0

Download

ShuffleNet_V2 models trained on ICDAR 2013+2015 (training set):
https://pan.baidu.com/s/1Um0wzbTFjJC0jdJ703GR7Q

Train

Modify genText.py to generate the txt list files for the training/testing data.
Modify config.json.
Run:

python train.py
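The list-file step above can be sketched as follows. This is not the contents of genText.py: the function name `write_list_file`, the tab-separated `image_path<TAB>label_path` line format, and the rule that a label shares its image's file stem are all assumptions made for illustration, so check genText.py for the format the training code actually expects.

```python
from pathlib import Path


def write_list_file(image_dir, label_dir, out_path, image_ext=".jpg"):
    """Write one 'image_path<TAB>label_path' line per image that has a label file.

    Hypothetical helper: the line format and the 'same stem + .txt' pairing
    rule are assumptions, not taken from genText.py.
    """
    lines = []
    for image in sorted(Path(image_dir).glob(f"*{image_ext}")):
        label = Path(label_dir) / f"{image.stem}.txt"
        if label.exists():  # skip images without annotations
            lines.append(f"{image}\t{label}")
    text = "\n".join(lines) + ("\n" if lines else "")
    Path(out_path).write_text(text, encoding="utf-8")
    return len(lines)
```

Calling `write_list_file("data/img", "data/gt", "train.txt")` (hypothetical paths) returns the number of image/label pairs written.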

Predict

python predict.py

Eval

Run:

python eval.py

Examples

contour

bbox

Todo

[ ] MobileNet backbone
[ ] Deformable convolution
[ ] tensorboard support
[ ] FPN --> architecture in the paper
[ ] Dice Loss --> BCE Loss
[ ] threshold map gt use 1 --> threshold map gt use distance (using 1 accelerates label generation)
[ ] OHEM
[ ] OpenCV_DNN inference API for CPU machine
[ ] Caffe version (for deploying with MNN/NCNN)
[ ] ICDAR13 / ICDAR15 / CTW1500 / MLT2017 / Total-Text
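For the threshold-map item above, the geometric core of a distance-based ground truth is the distance from a pixel to the closest segment of a text polygon. The sketch below shows only that computation in plain Python. The function names are illustrative, and the remaining label-generation steps from the paper (restricting the map to a border region and normalizing the distances) are not shown.

```python
import math


def point_to_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point (px, py) to the segment (ax, ay)-(bx, by)."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of the point onto the segment's line, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polygon(px, py, polygon):
    """Smallest distance from the point to any edge of a closed polygon.

    `polygon` is a list of (x, y) vertices; the last vertex connects to the first.
    """
    edges = zip(polygon, polygon[1:] + polygon[:1])
    return min(
        point_to_segment_distance(px, py, ax, ay, bx, by)
        for (ax, ay), (bx, by) in edges
    )
```

Evaluating this for every pixel near every polygon is much more work than filling the map with a constant 1, which is consistent with the note that the constant speeds up label generation.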

