Shears

Extract pictures from historical book scans.

Installation

pip install shears

Basic Usage

Suppose you want to extract the image content within the following page scan:

Assuming you have saved the page scan to your current working directory, you can extract the image content with the following:

import shears

# extract the image content
result = shears.clip('input.jpg')

# show the extracted image
shears.plot_image(result)

# save the extracted image
shears.save_image(result, 'result.jpg')

This returns and saves the following image:

That's all it takes! The examples below show how to process more complex input images.
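To apply the same call to a whole folder of scans, a small loop is enough. The sketch below is one possible way to do it, not part of shears itself: `clip_directory` is a hypothetical helper, and the `clip` and `save` arguments are assumed to behave like `shears.clip` and `shears.save_image` as used above (path in, result out; result and path in, file written).

```python
from pathlib import Path


def clip_directory(input_dir, output_dir, clip, save, pattern="*.jpg"):
    """Clip every scan matching `pattern` in `input_dir` and save the results.

    `clip` and `save` are passed in as callables (for example shears.clip and
    shears.save_image), so the loop makes no further assumptions about the
    library. This helper is hypothetical and not part of shears.
    Returns the list of output paths that were written.
    """
    input_dir = Path(input_dir)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # sort so that pages are processed in a stable, predictable order
    for scan_path in sorted(input_dir.glob(pattern)):
        # pass plain strings, matching the string paths used in the examples
        result = clip(str(scan_path))
        out_path = output_dir / scan_path.name
        save(result, str(out_path))
        written.append(out_path)
    return written
```

With shears installed, calling `clip_directory('scans', 'clipped', shears.clip, shears.save_image)` would process every `.jpg` file in a `scans` folder; both directory names are placeholders.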

Processing Book Scans

Suppose you want to extract the illustration content from the page scan below:

To extract illustrations from pages like this, pass filter arguments to shears.clip:

import shears

# use the filter parameters to pull out the illustration on a page
result = shears.clip('input.jpg',
                     filter_min_size=900,
                     filter_threshold=0.8,
                     filter_connectivity=1)

# show the extracted illustration
shears.plot_image(result, 'Extracted Image')

This returns the following image:

For additional examples, please see the sample notebooks in this repository.
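Good filter values tend to depend on the scan, so it can help to try a few combinations and compare the results side by side. The following is a minimal sketch of such a sweep. `sweep_filter_params` is a hypothetical helper rather than a shears function, and it assumes only that `clip` accepts the three `filter_*` keyword arguments used in this section; the values passed in are whatever you choose to try.

```python
from itertools import product


def sweep_filter_params(clip, path, min_sizes, thresholds, connectivity=1):
    """Run `clip` once per (min_size, threshold) pair and collect the results.

    `clip` is expected to accept the keyword arguments filter_min_size,
    filter_threshold and filter_connectivity, as shears.clip is called in
    this README. This helper is hypothetical and not part of shears.
    Returns a dict keyed by (min_size, threshold).
    """
    results = {}
    # product() yields every combination of the two value lists
    for min_size, threshold in product(min_sizes, thresholds):
        results[(min_size, threshold)] = clip(
            path,
            filter_min_size=min_size,
            filter_threshold=threshold,
            filter_connectivity=connectivity,
        )
    return results
```

With shears installed, `sweep_filter_params(shears.clip, 'input.jpg', [500, 900], [0.7, 0.8])` would return four results, each of which can be inspected with `shears.plot_image`.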

Testing

To run the test suite, run:

pytest

shears
Release 0.0.3

