arche
Release 0.3.6

Analyze Scrapy Cloud data

Keywords: scrapinghub, scraping, data, data-visualization, data-analysis, pandas, jupyter, python3, scrapy
License: MIT
Install: pip install arche==0.3.6

Documentation

Arche

pip install arche

Arche (pronounced Arkey) helps verify scraped data against a set of defined rules, for example:

Validation with JSON schema
Coverage (items, fields, categorical data, including booleans and enums)
Duplicates
Garbage symbols
Comparison of two jobs
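
To make a few of these rule types concrete, here is a minimal sketch in plain Python. It does not use Arche's API; the sample items, field names, garbage markers, and helper functions are all hypothetical and only illustrate what coverage, duplicate, and garbage-symbol checks might look like.

```python
from collections import Counter

# Hypothetical sample of scraped items; field names are illustrative only.
items = [
    {"name": "Widget", "price": "9.99", "in_stock": True},
    {"name": "Gadget", "price": None, "in_stock": False},
    {"name": "Widget", "price": "9.99", "in_stock": True},
    {"name": "Gizmo &nbsp;", "in_stock": True},
]


def field_coverage(items):
    """Count, per field, how many items carry a non-empty value."""
    counts = Counter()
    for item in items:
        for field, value in item.items():
            # False is a legitimate boolean value, so only None and "" count as empty.
            if value not in (None, ""):
                counts[field] += 1
    return dict(counts)


def find_duplicates(items, keys):
    """Return the key tuples that occur more than once."""
    seen = Counter(tuple(item.get(k) for k in keys) for item in items)
    return [combo for combo, n in seen.items() if n > 1]


def find_garbage(items, markers=("&nbsp;", "<br>", "\ufffd")):
    """Return (index, field) pairs whose string value contains a marker."""
    hits = []
    for i, item in enumerate(items):
        for field, value in item.items():
            if isinstance(value, str) and any(m in value for m in markers):
                hits.append((i, field))
    return hits


coverage = field_coverage(items)
duplicates = find_duplicates(items, keys=("name", "price"))
garbage = find_garbage(items)
```

Arche runs checks of this kind on job data inside a notebook; the sketch only shows the underlying idea on an in-memory list.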

We use it at Scrapinghub, among other tools, to ensure the quality of scraped data.

Installation

Arche requires a Jupyter environment and supports both the JupyterLab and Notebook UIs.

For JupyterLab, you will also need to install the plotly extensions.

Then just pip install arche

Why

To check the quality of scraped data continuously. For example, after scraping a website, a typical approach is to validate the data with Arche. You can also create a schema and then set up Spidermon for ongoing monitoring.
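
As an illustration of what a continuous check could look like, the sketch below compares field coverage between two runs and flags fields whose coverage dropped. It is plain Python, not Arche or Spidermon code; the function names, the sample runs, and the 0.1 tolerance are assumptions made for the example.

```python
def coverage_ratio(items, field):
    """Share of items in which `field` has a non-empty value."""
    if not items:
        return 0.0
    filled = sum(1 for item in items if item.get(field) not in (None, ""))
    return filled / len(items)


def coverage_drops(previous, current, fields, tolerance=0.1):
    """List fields whose coverage fell by more than `tolerance` between runs."""
    drops = {}
    for field in fields:
        before = coverage_ratio(previous, field)
        after = coverage_ratio(current, field)
        if before - after > tolerance:
            drops[field] = (before, after)
    return drops


# Hypothetical data from two consecutive runs of the same spider.
previous_run = [{"name": "A", "price": "1"}, {"name": "B", "price": "2"}]
current_run = [{"name": "A", "price": None}, {"name": "B", "price": "2"}]

drops = coverage_drops(previous_run, current_run, fields=("name", "price"))
```

Here only `price` is reported, because its coverage fell from every item to half of them while `name` stayed fully covered.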

Developer Setup

pipenv install --dev
pipenv shell
tox

Contribution

Any contributions are welcome! See https://github.com/scrapinghub/arche/issues if you want to take on something or suggest an improvement/report a bug.

Dependencies: 0
Dependent packages: 0
Dependent repositories: 0
Total releases: 9
Latest release: Jul 12, 2019
First release: Mar 19, 2019
Stars: 11
Forks: 7
Watchers: 10
Contributors: 4
Repository size: 27.1 MB
SourceRank: 10

Releases

0.3.6: Jul 12, 2019
0.3.5: May 14, 2019
0.3.4: May 6, 2019
0.3.3: May 3, 2019
0.3.2: Apr 18, 2019
0.3.1: Apr 13, 2019
0.3.0: Apr 12, 2019
2019.3.25: Mar 26, 2019
2019.3.18: Mar 19, 2019

Last synced: 2023-11-30 01:05:31 UTC
