GitHub topics: extraction
aphp/edspdf
EDS-PDF is a generic, pure-Python framework for text extraction from PDF documents. It provides the machinery to use rule-based or machine-learning-based approaches to classify text blocks as body or metadata.
Language: Python - Size: 8.93 MB - Last synced at: about 21 hours ago - Pushed at: 6 months ago - Stars: 51 - Forks: 7
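
To make the "rule-based" idea in the description concrete, here is a minimal sketch of classifying text blocks as body or metadata by their vertical position on the page. It does not use the EDS-PDF API: the `TextBlock` class, the `classify_block` function, and the `header_limit` / `footer_limit` thresholds are all hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Hypothetical text block; coordinates are fractions of page height (0 = top)."""
    text: str
    y0: float  # top edge of the block
    y1: float  # bottom edge of the block


def classify_block(block: TextBlock,
                   header_limit: float = 0.08,
                   footer_limit: float = 0.92) -> str:
    """Label a block "meta" if it lies entirely in the assumed header or
    footer band of the page, and "body" otherwise."""
    in_header = block.y1 <= header_limit
    in_footer = block.y0 >= footer_limit
    return "meta" if in_header or in_footer else "body"


# Illustrative blocks only; not taken from any real document.
blocks = [
    TextBlock("Hospital X, discharge summary", 0.02, 0.05),
    TextBlock("The patient was admitted for observation.", 0.30, 0.45),
    TextBlock("Page 1/3", 0.95, 0.97),
]
labels = [classify_block(b) for b in blocks]
```

A machine-learning approach would presumably replace the fixed thresholds with a model trained on labelled blocks, while keeping the same block-in, label-out shape.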
