Simple NLP Pipelining based on a file system

NLP, pipelining
pip install nlpipe==0.53



NOTE: This repository is obsolete

I'm working to replace nlpipe with a more standalone and much simpler system (i.e. one with less dependency on AmCAT/elastic/celery); see vanatteveldt/nlpipe.

Simple NLP Pipelining based on elastic + celery

NLPipe is a very simple caching NLP pipelining system built on elasticsearch (backend) and celery (job management).

To use it, define one or more tasks based on module.NLPipeModule. A task should convert input (raw text or the result of earlier processing) to output. Input documents should be in the elasticsearch store, and output will be placed there.

A calling application can then ask for the results of one or more documents. If the documents are already processed (cached), the result is immediately returned. Otherwise, a processing task is placed on the celery queue.
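The cache-or-enqueue behaviour described above can be sketched roughly as follows. This is a minimal illustration, not NLPipe's actual code: the `cache` dict and `queue` deque are in-memory stand-ins for the elasticsearch store and the celery queue, and the names `get_result` and `run_worker` are hypothetical.

```python
from collections import deque

# In-memory stand-ins (assumptions): the real system keeps results in
# elasticsearch and schedules pending work through celery.
cache = {}        # (task_name, doc_id) -> processed result
queue = deque()   # pending (task_name, doc_id) jobs


def get_result(task_name, doc_id):
    """Return the cached result, or enqueue the job and return None."""
    key = (task_name, doc_id)
    if key in cache:
        return cache[key]        # already processed: answer immediately
    if key not in queue:
        queue.append(key)        # not processed yet: schedule it once
    return None


def run_worker(tasks, documents):
    """Drain the queue; `tasks` maps task names to plain functions."""
    while queue:
        task_name, doc_id = queue.popleft()
        cache[(task_name, doc_id)] = tasks[task_name](documents[doc_id])
```

With a task such as `{"tokenize": str.split}`, the first call to `get_result` returns `None` and queues the job; after `run_worker` has run, the same call returns the stored result.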

Inspired by xtas


Install directly from github:

pip install git+git://

To run NLPipe, you need elasticsearch and rabbitmq, both of which can be installed directly using apt:

sudo apt-get install elasticsearch rabbitmq-server


Configuration is contained in the nlpipe/ and nlpipe/ modules. System (site) settings can be set using environment variables, in particular:

  • NLPIPE_ES_HOST - elasticsearch host (default: localhost)
  • NLPIPE_ES_PORT - elasticsearch port (default: 9200)
  • NLPIPE_BROKER_HOST - rabbitmq host (default:
  • NLPIPE_BROKER_PORT - rabbitmq port (default: 5672)
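
As an illustration of how these variables might be read, here is a minimal sketch. The `load_settings` helper and the dictionary keys are hypothetical and not part of nlpipe. The default for NLPIPE_BROKER_HOST is not given above, so the sketch leaves it as None rather than guessing.

```python
import os


def load_settings(environ=os.environ):
    """Collect site settings from environment variables (hypothetical helper)."""
    return {
        "es_host": environ.get("NLPIPE_ES_HOST", "localhost"),
        "es_port": int(environ.get("NLPIPE_ES_PORT", 9200)),
        # Default not stated in this document, so None is an assumption here.
        "broker_host": environ.get("NLPIPE_BROKER_HOST"),
        "broker_port": int(environ.get("NLPIPE_BROKER_PORT", 5672)),
    }
```

Passing a plain dict instead of `os.environ` makes the helper easy to try out, e.g. `load_settings({"NLPIPE_ES_PORT": "9300"})`.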