PDF4Cat: a simple and powerful tool for processing PDF documents using PyMuPDF


Keywords
django, pdf, pymupdf, python3, tool
License
Apache-2.0
Install
pip install PDF4Cat==0.5.0

Documentation

PDF4Cat

Docs

Planned additions:

  • CLI
  • Async work & optimizations

PDF:

  • Merge
  • Split
  • Rotate
  • Edit Pages
  • Delete Pages and save to PDF (from PDF)
  • Extract Pages and save to PDF (from PDF)
  • Protect (Encrypt)
  • Unlock (Decrypt)
  • Compress (Flate)

Other things:

  • OCR PDF
  • PDF to Images
  • Images to PDF

Actions to add for other document formats:

  • DOCX
  • PowerPoint
  • OpenOffice documents

Note: before using OCR, complete these steps:

Install Tesseract.

Locate Tesseract’s language support folder. Typically you will find it here:

Windows: C:\Program Files\Tesseract-OCR\tessdata

Unix systems: /usr/share/tesseract-ocr/4.00/tessdata

Set the environment variable TESSDATA_PREFIX to that folder:

Windows: set TESSDATA_PREFIX=C:\Program Files\Tesseract-OCR\tessdata

Unix systems: export TESSDATA_PREFIX=/usr/share/tesseract-ocr/4.00/tessdata
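
The same setup can also be done from Python for the current process only. The sketch below is a minimal illustration that uses only the standard library. The helper names `ensure_tessdata_prefix` and `tessdata_is_usable` are hypothetical and are not part of PDF4Cat or PyMuPDF, and the default paths are simply the typical locations listed above, which may differ on your machine. It is probably safest to run it before importing any library that performs OCR, since such libraries may read the variable at import time.

```python
import os
import sys
from pathlib import Path

# Typical locations copied from the note above (assumption: adjust to your install).
DEFAULT_TESSDATA = {
    "win32": r"C:\Program Files\Tesseract-OCR\tessdata",
    "other": "/usr/share/tesseract-ocr/4.00/tessdata",
}


def ensure_tessdata_prefix(path=None):
    """Set TESSDATA_PREFIX for this process if it is not already set.

    Returns the value that is in effect afterwards.
    """
    existing = os.environ.get("TESSDATA_PREFIX")
    if existing:
        # Respect a value the user has already exported.
        return existing
    if path is None:
        key = "win32" if sys.platform == "win32" else "other"
        path = DEFAULT_TESSDATA[key]
    os.environ["TESSDATA_PREFIX"] = str(path)
    return os.environ["TESSDATA_PREFIX"]


def tessdata_is_usable(prefix):
    """Return True if the folder exists and holds at least one *.traineddata file."""
    folder = Path(prefix)
    return folder.is_dir() and any(folder.glob("*.traineddata"))


if __name__ == "__main__":
    prefix = ensure_tessdata_prefix()
    if tessdata_is_usable(prefix):
        status = "looks usable"
    else:
        status = "was not found or has no language files"
    print(f"TESSDATA_PREFIX={prefix} ({status})")
```

A change made this way lasts only for the running Python process. The `set` and `export` commands above are still needed if other programs in the same shell session should see the variable.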