pagexml-slim

A wrapper for the PageXML C++ library that eases the handling of Page XML files from Python.


Keywords
annotation-processing, docker-image, document-representation, pagexml, python
License
MIT
Install
pip install pagexml-slim==2022.4.12
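
After installing, it can be useful to confirm that the pinned release is the one the interpreter actually sees. The following is a minimal sketch using only the standard library's importlib.metadata; the helper name installed_version is hypothetical and is not part of this package.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it is not provided by pagexml-slim.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Version pinned in the install command above.
EXPECTED = "2022.4.12"

found = installed_version("pagexml-slim")
if found is None:
    print("pagexml-slim is not installed")
elif found != EXPECTED:
    print(f"pagexml-slim {found} is installed, expected {EXPECTED}")
else:
    print(f"pagexml-slim {found} is installed")
```

The check looks up the distribution name used with pip, which may differ from the name of the module that is imported.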

Documentation

Introduction

A C++ library and a Python wrapper for working with Page XML files.
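
For readers unfamiliar with the format, the sketch below shows roughly what a Page XML file contains and how its text lines can be read. It deliberately uses only Python's standard xml.etree.ElementTree and does not show this library's own API. The sample document, the function names, and the choice of the 2013-07-15 schema namespace are assumptions made for illustration; real files may declare a different schema version.

```python
import xml.etree.ElementTree as ET

# Namespace of the 2013-07-15 PAGE schema (assumed here; files may use another version).
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Invented sample document, kept small for illustration.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="example.jpg" imageWidth="800" imageHeight="600">
    <TextRegion id="r1">
      <Coords points="10,10 400,10 400,120 10,120"/>
      <TextLine id="r1_l1">
        <Coords points="10,10 400,10 400,60 10,60"/>
        <TextEquiv><Unicode>Hello</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1_l2">
        <Coords points="10,70 400,70 400,120 10,120"/>
        <TextEquiv><Unicode>world</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def parse_points(points):
    """Turn a 'x1,y1 x2,y2 ...' string into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_text_lines(xml_text):
    """Return (id, polygon, text) for every TextLine element in the document."""
    root = ET.fromstring(xml_text)
    lines = []
    for line in root.iterfind(".//pc:TextLine", NS):
        coords = line.find("pc:Coords", NS)
        text = line.findtext("pc:TextEquiv/pc:Unicode", default="", namespaces=NS)
        lines.append((line.get("id"), parse_points(coords.get("points")), text))
    return lines


for line_id, polygon, text in read_text_lines(SAMPLE):
    print(line_id, polygon[0], text)
```

Each text line carries an identifier, a polygon in image coordinates, and an optional transcription, which is the kind of structure the wrapper is meant to make easier to manipulate.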

Requirements

For build and runtime requirements, see py-pagexml/README.rst, docker/Dockerfile_build, and docker/Dockerfile_runtime.

Contents

  • lib: Directory containing the C++ PageXML and TextFeatExtractor libraries.
  • py-pagexml: SWIG-based Python wrapper for the PageXML library.
  • py-textfeat: SWIG-based Python wrapper for the TextFeatExtractor library.

Documentation