shears

Extract illustrations from book page scans


Keywords: machine-vision, computer-vision
License: MIT
Install: pip install shears==0.0.3

Documentation

Shears

Extract pictures from historical book scans.

Installation

pip install shears

Basic Usage

Suppose you want to extract the image content within the following page scan:

Sample book page scan

Assuming you have saved the page scan as input.jpg in your current working directory, you can extract the image content with the following:

import shears

# extract the image content
result = shears.clip('input.jpg')

# show the extracted image
shears.plot_image(result)

# save the extracted image
shears.save_image(result, 'result.jpg')

This returns and saves the following image:

Sample cropped illustration

That's all it takes! The examples below show how to process more complex input images.
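
To run the same call over a folder of scans, a small wrapper along these lines may help. This is only a sketch: clip_directory is a hypothetical helper, not part of shears, and it assumes that the clip and save callables you pass in behave like shears.clip and shears.save_image in the snippet above (a file path in, a result out, then the result and a target path).

```python
from pathlib import Path


def clip_directory(input_dir, output_dir, clip, save, pattern='*.jpg'):
    """Clip every scan in input_dir that matches pattern and save it to output_dir.

    clip_directory is a hypothetical helper, not a shears function.
    clip is called as clip(path) and save as save(result, path).
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # sort the matches so the scans are processed in a predictable order
    for scan in sorted(Path(input_dir).glob(pattern)):
        result = clip(str(scan))
        target = out / scan.name
        save(result, str(target))
        written.append(target)
    return written


# Example call (assumes shears is installed and a scans/ folder exists):
# import shears
# clip_directory('scans', 'clipped', shears.clip, shears.save_image)
```

Passing the two functions in as arguments keeps the helper independent of the library, so it can be tried with stand-in functions first.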

Processing Book Scans

Suppose you want to extract the illustration content from the page scan below:

Sample book page scan

To extract illustrations from pages like this, pass filter arguments to shears.clip:

import shears

# use the filter parameters to pull out the illustration on a page
# ('input.jpg' is a placeholder path; replace it with your own page scan)
result = shears.clip('input.jpg',
                     filter_min_size=900,
                     filter_threshold=0.8,
                     filter_connectivity=1)

# show the extracted illustration
shears.plot_image(result, 'Extracted Image')

This returns the following image:

Sample cropped illustration
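
Suitable filter values tend to differ from scan to scan, so it can be useful to try several combinations. The sketch below only builds the keyword arguments. filter_grid is a hypothetical helper, not part of shears, and the parameter names are the ones used in the call above.

```python
from itertools import product


def filter_grid(min_sizes, thresholds, connectivities):
    """Yield one dict of clip keyword arguments per parameter combination.

    filter_grid is a hypothetical helper, not a shears function.
    """
    for min_size, threshold, connectivity in product(
            min_sizes, thresholds, connectivities):
        yield {
            'filter_min_size': min_size,
            'filter_threshold': threshold,
            'filter_connectivity': connectivity,
        }


# Example use (assumes shears is installed and input.jpg exists):
# import shears
# for kwargs in filter_grid([500, 900], [0.7, 0.8], [1]):
#     shears.plot_image(shears.clip('input.jpg', **kwargs), str(kwargs))
```

Using each combination as the plot title makes it easier to tell which settings produced which crop.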

For additional examples, please see the sample notebooks in this repository.

Testing

To run the test suite, run:

pytest