GitHub / Envinorma / pdf_ocr_app
A ready-to-deploy Dash application for parsing PDF files with Tesseract
JSON API: https://repos.data.code.gouv.fr/api/v1/hosts/GitHub/repositories/Envinorma%2Fpdf_ocr_app
Stars: 4
Forks: 2
Open issues: 0
License: MIT
Language: Python
Size: 403 KB
Dependencies parsed at:
Dependencies: 30
Created at: about 4 years ago
Updated at: 4 months ago
Pushed at: almost 4 years ago
Last synced at: 1 day ago
- actions/checkout v1 composite
- actions/setup-python v1 composite
- codecov/codecov-action v1 composite
requirements-dev.txt
pypi
- black ==20.8b1 development
- codecov >=2.1.4 development
- flake8 ==3.8.4 development
- ipython ==7.19.0 development
- mypy ==0.800 development
- pre-commit ==2.9.3 development
- pylint ==2.6.0 development
- pytest ==6.2.1 development
- pytest-cov >=2.9.0 development
- pytest-mypy ==0.8.0 development
- pytest-raises >=0.11 development
- pytest-runner >=5.2 development
requirements.txt
pypi
- Unidecode ==1.0.23
- alto-xml ==0.0.3
- beautifulsoup4 ==4.8.2
- dash ==1.17.0
- dash-bootstrap-components ==0.11.1
- gunicorn ==20.0.4
- lxml ==4.6.2
- opencv-python-headless ==4.5.1.48
- pdf2image ==1.14.0
- pytesseract ==0.3.7
- requests ==2.25.1
- requests-oauthlib ==1.3.0
- scipy ==1.6.1
- tesseract-ocr-utils ==0.0.4
- textdistance ==4.2.1
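
The JSON API address listed near the top of this page percent-encodes the repository's full name (`Envinorma/pdf_ocr_app` becomes `Envinorma%2Fpdf_ocr_app`). The following is a minimal sketch, using only the Python standard library, of how such a URL could be built and queried. The function names `repository_api_url` and `fetch_repository` are hypothetical and chosen for illustration, and no assumption is made about which fields the response contains.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base of the API URL shown on this page.
API_BASE = "https://repos.data.code.gouv.fr/api/v1"


def repository_api_url(host: str, full_name: str) -> str:
    """Build the JSON API URL for a repository (hypothetical helper).

    The full name ("owner/repo") is percent-encoded with safe="" so that
    the slash becomes %2F, matching the URL listed on this page.
    """
    return f"{API_BASE}/hosts/{host}/repositories/{quote(full_name, safe='')}"


def fetch_repository(host: str, full_name: str) -> dict:
    """Fetch and decode the JSON document (hypothetical helper).

    Requires network access; the structure of the returned data is not
    assumed here.
    """
    with urlopen(repository_api_url(host, full_name)) as response:
        return json.load(response)


if __name__ == "__main__":
    print(repository_api_url("GitHub", "Envinorma/pdf_ocr_app"))
```

Passing `safe=""` matters here: by default `quote` leaves `/` unescaped, which would split the repository name into two path segments.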