An interactive document scanner built in Python using OpenCV

Feb 13, 2022 2 min read

Document Scanner

An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive color threshold to clean up the image.

On my test dataset of 280 images, the program correctly detected the corners of the document 92.8% of the time.

This project makes use of the transform and imutils modules from pyimagesearch (which can be accessed here). The UI code for the interactive mode is adapted from poly_editor.py from here.

You can manually click and drag the corners of the document to be perspective transformed:
The scanner can also process an entire directory of images automatically and save the output in an output directory:

Here are some examples of images before and after scan:

Usage

python scan.py (--images <IMG_DIR> | --image <IMG_PATH>) [-i]

The -i flag enables interactive mode, where you will be prompted to click and drag the corners of the document. For example, to scan a single image with interactive mode enabled:

python scan.py --image sample_images/desk.JPG -i

Alternatively, to scan all images in a directory without any input:

python scan.py --images sample_images

John was the first writer to have joined pythonawesome.com. He has since then inculcated very effective writing and reviewing culture at pythonawesome which rivals have found impossible to imitate.