Sheet Data Image/PDF-to-CSV Converter

Dependencies

pip install pdf2image
pip install opencv-python
pip install pytesseract

This tool also requires the Pillow library, which can be installed through conda:

conda install -c conda-forge pillow
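
To confirm the installation before running the tool, a small check like the one below may help. It is only a sketch and not part of the project: the function names are hypothetical, and it relies on the fact that opencv-python is imported as cv2 and Pillow as PIL. It also looks for the tesseract executable, since pytesseract is a wrapper around a separately installed Tesseract binary.

```python
import importlib.util
import shutil

# Import names differ from the pip/conda package names listed above.
REQUIRED_MODULES = {
    "pdf2image": "pdf2image",
    "opencv-python": "cv2",
    "pytesseract": "pytesseract",
    "pillow": "PIL",
}


def check_dependencies():
    """Return a dict mapping package name -> True if its module can be found."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in REQUIRED_MODULES.items()
    }


def tesseract_on_path():
    """Return True if a 'tesseract' executable is found on PATH."""
    return shutil.which("tesseract") is not None


if __name__ == "__main__":
    for package, found in check_dependencies().items():
        print(f"{package}: {'ok' if found else 'MISSING'}")
    print(f"tesseract binary: {'ok' if tesseract_on_path() else 'MISSING'}")
```

Note that pdf2image additionally depends on the Poppler utilities being installed on the system, which this sketch does not check.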

Usage

python spreadsheet_parser.py [input-file] [output-file] [language]

input-file: Input file; must be a PDF file or an image file
output-file: Path for the CSV output
language: Content language (the tessdata model for that language must already be installed)
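
The three arguments above could be handled roughly as follows. This is a hedged sketch, not the actual code of spreadsheet_parser.py: the helper names, the list of image extensions, and the choice to make language optional with a default of "eng" are all assumptions made for illustration.

```python
import argparse
from pathlib import Path

# Assumed set of image extensions; the tool's real list is not documented.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def build_parser():
    """Build a parser mirroring: input-file output-file [language]."""
    parser = argparse.ArgumentParser(prog="spreadsheet_parser.py")
    parser.add_argument("input_file", type=Path, help="PDF or image file to convert")
    parser.add_argument("output_file", type=Path, help="path for the CSV output")
    # Assumption: language is optional and falls back to English ("eng").
    parser.add_argument("language", nargs="?", default="eng",
                        help="Tesseract language code")
    return parser


def input_kind(path):
    """Classify the input as 'pdf' or 'image' by file extension; raise otherwise."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix in IMAGE_SUFFIXES:
        return "image"
    raise ValueError(f"unsupported input type: {suffix or '(none)'}")
```

For instance, build_parser().parse_args(["examples/test.png", "output.csv"]) yields a namespace whose language is "eng" under this sketch, and input_kind classifies the input as "image".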

Example:

python spreadsheet_parser.py examples/test.png output.csv
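
Once the command has finished, the result can be inspected with Python's standard csv module. The snippet below is a minimal sketch; read_rows is a hypothetical helper, and UTF-8 encoding of the output file is an assumption.

```python
import csv


def read_rows(path, encoding="utf-8"):
    """Read a CSV file and return its rows as a list of lists of strings."""
    # newline="" lets the csv module handle line endings inside quoted cells.
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.reader(handle))


def print_preview(path, limit=5):
    """Print the first few rows of the CSV for a quick sanity check."""
    for row in read_rows(path)[:limit]:
        print(row)
```

Calling print_preview("output.csv") shows the first rows, which makes it easy to spot misrecognized cells or shifted columns.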
