tess_py_api
Python wrapper around Tesseract-OCR C language API using Python built-in ctypes module.
tess_py_api is much faster than pytesseract when dealing with more than one image and lets Python developers use all of the Tesseract library functionality using the auto-generated Python base wrapper.
INSTALLATION
me@home:~$ pip install tess_py_api
To use this module you need to have Tesseract installed.
For Windows users, Tesseract-OCR executable can be downloaded from here (note that you will need to add Tesseract to your path).
For Linux users, you can compile Tesseract from source or get it via your package manager (note that you will need to add an enviroment variable called TESSDATA_PREFIX pointing to your tessdata directory).
USAGE
from tess_py_api import Pyapi
api = Pyapi("eng")
image_path = "/path/to/image"
print(api.image_to_string(image_path))
if you have OpenCV installed you can use it:
import cv2 as cv
from tess_py_api import Pyapi
api = Pyapi("eng")
image = cv.imread("path/to/image")
print(api.image_to_string(image))
or you can use python built-in PIL/pillow:
from PIL import Image
from tess_py_api import Pyapi
api = Pyapi("eng")
image = Image.open("path/to/image")
print(api.image_to_string(image)
Python context manager is also supported:
from tess_py_api import Pyapi
with Pyapi("eng") as api:
print(api.image_to_string("/path/to/image"))
LICENSE
This repository is under the MIT license.
The repository wraps code under the Apache 2.0 license for Tesseract.