docxreviews2txt

Command line tool to extract review changes from a docx file as plain text


Keywords
one, two, docx, docx-review-changes, google-docs, ms-word, ms-word-docx-file, plain-text, python, review-changes, review-changes-as-plan-text
License
MIT
Install
pip install docxreviews2txt==0.4.5

Documentation

Command line tool to extract review changes from a docx file as plain text. It is useful when you review a PDF file that has been converted to docx and need to share the changes as plain text.

How to install?

pip install docxreviews2txt

How to use it?

usage: docxreviews2txt [-h] [--version] docx

Command line tool to extract review changes from a docx file as plain text.

positional arguments:
  docx        input docx

options:
  -h, --help  show this help message and exit
  --version   show version

Example:

$ docxreviews2txt tests/lorem_ipsum.docx
txt reviews at file:///home/alan/src/docxreviews2txt/tests/lorem_ipsum_review.txt
$ cat /home/alan/src/docxreviews2txt/tests/lorem_ipsum_review.txt
# Typos suggestions (using HTML tags <ins> and <del>)
- dolor sit amet, consectetur <ins>Lorem ipsum</ins><del>adipiscing</del>
- sit amet, consectetur adipiscing<ins>s</ins> elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim <ins>do</ins>
- Ut enim ad minim <ins>Lorem</ins>veniam<ins>ipsum</ins>
- dolor sit amet, consectetur <del>adipiscing</del>
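
For readers curious how output like the above can be produced, here is a minimal sketch that uses only the Python standard library. It is not this tool's implementation, which may work differently. It relies on the fact that a docx file is a zip archive whose word/document.xml stores tracked insertions as w:ins elements and tracked deletions as w:del elements (with deleted text in w:delText). The function names are hypothetical and chosen for illustration only.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_to_markup(par):
    """Render one w:p element, wrapping tracked changes in <ins>/<del>.

    Returns (text, changed); changed is True when the paragraph holds
    at least one tracked insertion or deletion as a direct child.
    """
    parts = []
    changed = False
    for child in par:
        if child.tag == f"{{{W}}}ins":
            text = "".join(t.text or "" for t in child.iter(f"{{{W}}}t"))
            parts.append(f"<ins>{text}</ins>")
            changed = True
        elif child.tag == f"{{{W}}}del":
            text = "".join(t.text or "" for t in child.iter(f"{{{W}}}delText"))
            parts.append(f"<del>{text}</del>")
            changed = True
        else:
            # Ordinary runs contribute their text; property elements add nothing
            parts.append("".join(t.text or "" for t in child.iter(f"{{{W}}}t")))
    return "".join(parts), changed


def review_lines(document_xml):
    """Return one '- ...' line per body-level paragraph with tracked changes."""
    root = ET.fromstring(document_xml)
    body = root.find(f"{{{W}}}body")
    if body is None:
        return []
    lines = []
    # Assumption of this sketch: only paragraphs directly under w:body are visited
    for par in body.findall(f"{{{W}}}p"):
        text, changed = paragraph_to_markup(par)
        if changed:
            lines.append(f"- {text}")
    return lines


def review_lines_from_docx(path):
    """Read word/document.xml out of the docx zip archive and extract changes."""
    with zipfile.ZipFile(path) as zf:
        return review_lines(zf.read("word/document.xml"))
```

Calling review_lines_from_docx("tests/lorem_ipsum.docx") would return a list of lines in the same spirit as the example output, though the exact text depends on how the document was edited.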

Known issues

The tool fails to capture changes in docx files whose text is laid out in tables (e.g., pdf2docx converts PDF columns to tables).
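
As an illustration of why tables matter, the sketch below shows how paragraphs nested in table cells differ from body-level paragraphs in word/document.xml. In WordprocessingML, table text lives in w:tbl / w:tr / w:tc / w:p, so a search limited to the direct children of w:body never sees it, while a recursive search does. Whether this is the actual cause of the issue in this tool is an assumption; the function name and the include_tables parameter are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def tag(name):
    """Build a namespace-qualified tag name for ElementTree lookups."""
    return f"{{{W}}}{name}"


def joined_text(element, text_tag):
    """Concatenate the text of all text_tag descendants of element."""
    return "".join(t.text or "" for t in element.iter(tag(text_tag)))


def paragraphs_with_changes(document_xml, include_tables=True):
    """List tracked insertions and deletions per paragraph.

    With include_tables=False only paragraphs directly under w:body are
    visited; with include_tables=True the search is recursive, so
    paragraphs inside table cells are visited as well.
    """
    root = ET.fromstring(document_xml)
    body = root.find(tag("body"))
    if body is None:
        return []
    if include_tables:
        pars = body.iter(tag("p"))      # recursive, document order
    else:
        pars = body.findall(tag("p"))   # direct children only
    result = []
    for par in pars:
        inserted = [joined_text(e, "t") for e in par.iter(tag("ins"))]
        deleted = [joined_text(e, "delText") for e in par.iter(tag("del"))]
        # Drop empty entries, e.g. change marks that carry no text
        inserted = [s for s in inserted if s]
        deleted = [s for s in deleted if s]
        if inserted or deleted:
            result.append({"inserted": inserted, "deleted": deleted})
    return result
```

Running the function twice on the same XML, once with include_tables=False and once with the default, makes the difference visible: changes inside table cells appear only in the second result.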

References

This project takes inspiration from: