sklearn-deltatfidf

DeltaTfidfVectorizer for scikit-learn


Keywords
sklearn, scikit-learn, tfidf, deltatfidf, delta, delta-tf-idf, python, sentiment-analysis, tf-idf
License
MIT
Install
pip install sklearn-deltatfidf==0.3

Documentation

Delta TF-IDF was proposed in an article by Justin Martineau and Tim Finin and is usually associated with sentiment classification or polarity detection of text.
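
As a rough illustration of the idea, the sketch below computes Delta TF-IDF weights by hand: each term count is scaled by the difference between the term's inverse document frequency in the positive documents and in the negative documents. The function name `delta_tfidf`, the whitespace tokenization, and the add-one smoothing are assumptions made for this example; this is not the code used by `DeltaTfidfVectorizer`, whose exact weighting may differ.

```python
import math
from collections import Counter


def delta_tfidf(docs, labels, positive_label=1):
    """Return one {term: weight} dict per document (illustrative sketch)."""
    # Assumption: plain whitespace tokenization.
    tokenized = [doc.split() for doc in docs]

    # Split the documents by class, keeping the set of terms in each one.
    pos = [set(toks) for toks, y in zip(tokenized, labels) if y == positive_label]
    neg = [set(toks) for toks, y in zip(tokenized, labels) if y != positive_label]

    # Document frequency of every term within each class.
    pos_df = Counter(t for toks in pos for t in toks)
    neg_df = Counter(t for toks in neg for t in toks)
    n_pos, n_neg = len(pos), len(neg)

    def delta_idf(term):
        # Assumption: add-one smoothing so unseen terms do not divide by zero.
        idf_pos = math.log2((n_pos + 1) / (pos_df[term] + 1))
        idf_neg = math.log2((n_neg + 1) / (neg_df[term] + 1))
        return idf_pos - idf_neg

    # Weight = term count in the document * (positive IDF - negative IDF).
    return [
        {term: count * delta_idf(term) for term, count in Counter(toks).items()}
        for toks in tokenized
    ]


data = ['word1 word2', 'word2', 'word2 word3', 'word4']
labels = [1, -1, -1, 1]
weights = delta_tfidf(data, labels)
```

With this sign convention, terms seen mostly in positive documents get negative weights and terms seen mostly in negative documents get positive weights; terms spread evenly across both classes end up near zero.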

Usage

from sklearn import svm
from sklearn.pipeline import Pipeline
from sklearn_deltatfidf import DeltaTfidfVectorizer

# fit on labeled documents; the labels are used to compute the delta weights
v = DeltaTfidfVectorizer()
data = ['word1 word2', 'word2', 'word2 word3', 'word4']
labels = [1, -1, -1, 1]
v.fit_transform(data, labels)

# you can use it in pipelines as usual
pipe = Pipeline([
    ('vectorizer', DeltaTfidfVectorizer()),
    ('clf', svm.LinearSVC())
])
pipe.fit(data, labels)

Installation

With pip:

$ pip install sklearn-deltatfidf

From source:

$ git clone https://github.com/r-m-n/sklearn-deltatfidf.git
$ cd sklearn-deltatfidf
$ python setup.py install