bedrock

Bedrock is a high-level text pre-processing API written in Python that can run on NLTK or spaCy as its backend.


Keywords: nlp, pre-processing, text
License: MIT
Install: pip install bedrock==0.1.0.dev10

You have discovered bedrock

Bedrock is a high-level text pre-processing API written in Python that can run on NLTK or spaCy as its backend. It lets you quickly perform the text-processing groundwork, doing the menial work so you don't have to.

Use this library if you find the following highlights useful:

  • Fast prototyping
  • Switching between different backends
  • Working in batches rather than writing loops
  • Support for DataFrame inputs/outputs
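
To illustrate what backend switching and batch processing mean in practice, here is a minimal sketch in plain Python. It does not use bedrock's API: the two tokenizers, the `BACKENDS` table, and `tokenize_batch` are hypothetical names made up for this example, and the stand-in tokenizers only play the role that NLTK or spaCy would play as real backends.

```python
import re
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-ins for real backends such as NLTK or spaCy.
def whitespace_tokenize(text: str) -> List[str]:
    """Split on runs of whitespace only."""
    return text.split()

def regex_tokenize(text: str) -> List[str]:
    """Return words and single punctuation marks as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

# Hypothetical registry: the backend is selected by name.
BACKENDS: Dict[str, Callable[[str], List[str]]] = {
    "whitespace": whitespace_tokenize,
    "regex": regex_tokenize,
}

def tokenize_batch(texts: Iterable[str], backend: str = "regex") -> List[List[str]]:
    """Tokenize a whole batch of texts with the chosen backend."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    tokenize = BACKENDS[backend]
    return [tokenize(text) for text in texts]

if __name__ == "__main__":
    docs = ["Hallo Welt!", "Hello world."]
    print(tokenize_batch(docs, backend="regex"))
    print(tokenize_batch(docs, backend="whitespace"))
```

The caller passes a list of texts and a backend name; the loop and the backend-specific call stay inside the helper, which is the convenience the highlights above describe.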

Install bedrock in a jiffy:

pip install bedrock
bedrock download all

From zero to bedrock hero in 10 seconds

Now you can run:

import bedrock
bedrock.process.pipeline('Hallo Welt')

Congrats! 🎉

Engines and Languages

Currently bedrock supports the following engines:

  • spacy
  • nltk

It supports the following languages, listed with their download arguments:

  • English ('en' or 'english')
  • German ('de', 'german' or 'deutsch')
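
If your own code accepts any of these spellings, it can help to map them to a single code first. The sketch below is not part of bedrock: `LANGUAGE_ALIASES` and `normalize_language` are hypothetical names, and the alias table simply copies the arguments listed above.

```python
# Aliases taken from the list above; the short code is used as the canonical form.
LANGUAGE_ALIASES = {
    "en": "en",
    "english": "en",
    "de": "de",
    "german": "de",
    "deutsch": "de",
}

def normalize_language(name: str) -> str:
    """Map a language argument such as 'Deutsch' to its short code ('de')."""
    key = name.strip().lower()
    if key not in LANGUAGE_ALIASES:
        raise ValueError(f"unsupported language: {name!r}")
    return LANGUAGE_ALIASES[key]

if __name__ == "__main__":
    print(normalize_language("Deutsch"))
    print(normalize_language("english"))
```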

Installation and usage

Package installation

pip install bedrock

Install support for all languages:

bedrock download all

Install support only for English:

bedrock download en

Install support only for German:

bedrock download de
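
The download step can also be scripted, for example in a setup routine. The following is a minimal sketch that only wraps the commands shown above with the standard library's `subprocess` module. The helper names `build_download_command` and `download` are hypothetical, and the sketch assumes the `bedrock` command is on your PATH after installation.

```python
import subprocess
from typing import List

# Download arguments that appear in the commands above.
KNOWN_TARGETS = ("all", "en", "de")

def build_download_command(target: str) -> List[str]:
    """Return the argument list for 'bedrock download <target>'."""
    if target not in KNOWN_TARGETS:
        raise ValueError(f"unknown download target: {target!r}")
    return ["bedrock", "download", target]

def download(target: str) -> None:
    """Run the download command and raise CalledProcessError if it fails."""
    subprocess.run(build_download_command(target), check=True)

if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try.
    for target in KNOWN_TARGETS:
        print(" ".join(build_download_command(target)))
```

Passing the command as a list avoids shell quoting issues, and `check=True` makes a failed download surface as an exception instead of passing silently.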

Import the package's modules in your code:

from bedrock import process    # Processing texts
from bedrock import collection # Loading data collections
from bedrock import common     # Some common functions
from bedrock import feature    # Feature extraction
from bedrock import viz        # Visualizations