PDFSegmenter

This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.
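
To illustrate the general idea of graph-based page segmentation, here is a minimal sketch. It is not PDFSegmenter's API or its actual algorithm: the `Box` type, the `gap`, `build_graph` and `segments` functions, and the `max_gap` threshold are all hypothetical names chosen for this example. It assumes text elements are given as bounding boxes, links boxes that lie close together, and treats connected components as page segments, a simple stand-in for the clustering step.

```python
from typing import Dict, List, NamedTuple, Set


class Box(NamedTuple):
    """Bounding box of one text element (hypothetical type for this sketch)."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def gap(a: Box, b: Box) -> float:
    """Euclidean distance between two boxes; 0.0 if they touch or overlap."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def build_graph(boxes: List[Box], max_gap: float) -> Dict[int, Set[int]]:
    """Connect every pair of boxes whose gap is at most max_gap."""
    graph: Dict[int, Set[int]] = {i: set() for i in range(len(boxes))}
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if gap(boxes[i], boxes[j]) <= max_gap:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def segments(graph: Dict[int, Set[int]]) -> List[List[int]]:
    """Return connected components (assumed stand-in for clustering)."""
    seen: Set[int] = set()
    result: List[List[int]] = []
    for start in sorted(graph):
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        component: List[int] = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        result.append(sorted(component))
    return result


boxes = [
    Box(0, 0, 10, 5, "a"),       # next to "b" on the same line
    Box(12, 0, 20, 5, "b"),
    Box(0, 100, 10, 105, "c"),   # far below, forms its own segment
]
print(segments(build_graph(boxes, max_gap=5.0)))  # [[0, 1], [2]]
```

Each resulting index group could then be passed to a classifier (for example text block versus table). That step is omitted here because the source does not describe how the classification works.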


Keywords
pdf, document-processing, python, page-segmentation, layout-analysis, cluster-analysis, annotations, csv, table, detection-model
License
MIT
Install
pip install PDFSegmenter==0.1